[Representative drawing: flowchart of steps S1, S2, S3, S4, END]



Japanese Unexamined Patent Application Publication No. 2002-175091



[3#fFW*£>IEH] 

[IS** 1 ] < k tf/l/*ft 

oT\ 

£j*k LTf8-r?,rt^^ai-5gig35:^lii^^§fglS^ttl 

±tB^'W*iJBi|I@(c «fc t) ¥"J»J«nfc«««M»c*S UTS 
PftfRO tz ib<D/ ^ * - 2 * Wfp-T 5 / S 5 ^ - * ftiJSPl 
St, 

^#-f § S^figxH i: **TT S c k *«f« k "T S £ * 

Q-JjlcT^jSo 

[IS** 2] ±83«KAtt, il«%rt§«^t^§ 
[IS** 3] ±s5fg8S^ffi^X@ti, ±E««*7 ? /l/ 

* i gB«o«^^ia^}£o 

[IS** 5] ±IE^IS:£te, l?S©fi*SLTl 
L> ±HEA*««fc3^T±IB*ra*7*/l/©:KflB*& 

{bs-ar* c k J; o ±mm\^m.-&t ^mm^^mt 
[is**7] '>&< tt^w^^i/^w-rsfg^^ 

ot, 

«¥"JSi]¥I8k, 

s^k UT«-r«F«js*^-r«8S*«:a*-r*«8SAtii 
t zm? zct z&m t 

Bo 

[IS** 8] ±BBS6BS*«, MS&F^OiT&i. 



C k £f#® k f £ »** 7 fl2M©^ JB^riBfcBo 
[IS** 9] ±I35gIS£tB*¥®«u ±|B^1t^-rVU 

©iB««ii6^m«<oiiBffi*iBA ft 1 1 fc±aes88S£*tn 

[IS**l 0] ±EKK*a^^fift«, JSSSSK^y 
Bo 

[IS**1 1] ±SB38B:fcW\ «a©«3R**LTdt 

^fi^Hrty C k k ? SIS** 7 8Ett©SJ»^l«g 
Bo 

[ts**i 2] ±8Bae±»tt, «Ke$n/tA^«« 

W^-r^k IT, ±f3IM^ceHir&!K«*7Vl/£*f 
U ±HBA^l'lfl$Btca-^V^T±l3®'W^T : VKDt>;Sg^^ 

ft* -e-s c k tc «fc d ±iBKf^^-r saHH^^i/SMt 
?a^§ci: t -r i) is** 7 sb«©§p-&js 

[IS** 13] #»ft«ftfcAailWRf<:»3^T»fl** 

g^k LT«-rsrt^*^-ra8fSX*m^-rssi8SAtB 

*3MSk, 
Kk, 

n, ±iB*ij®$nfc/^y-^tcs^v>T^^/ig-rs 

fio 

[is**i 4] ±iB?g|g3ictt, 

5Ck«r«f«k-rs»**l 3SB«cOD^-y h^fio 
[IS** 1 5 ] ±fBf8IS^:ai^¥IS«, ilB^'W^x 
;l/©«1ttKffi**ffi3£©M«*ffi*.fc k ^tc±SBf«IS^^ 
Hi* LT±E^^^fig¥S{c#t*&-r § c k *«f« k -T § 
IS**1 3&1&<Du#y hSBo 

[is** i 6 ] ±iaais Am*#stt, fgisttfc 5 > 

^AtCtf P,nrc±SBfSlg35C^tB* LT±Ee^fiR¥S 
tc^-T § elk ^Mk-T 5f»**l 3BH«ond?-y h 
^Bo 

[IS**1 7] ±!B5gIS£«, ISWgi^LTS 

g«^#tyck^^k-r?>W**i 3 8Etton#«yh 
^Bo 

[is**i 8] ±mxtimmcm-3^T±.izmm : eT 
■attend hmm„

[00 0 1 ] 
[0 0 0 2] 

[0003] fOJ;^^ nstfy hKBtefflv^tiSA 
XfcOig (A I : artificial intelligence) ti, fill • 
¥iJWH<D*q W*«fi6«r AlWfcSfcH Lfefe©7?fe0, 2 

e AxwKna-r* c 

*5ntt^. C©.fc3*AXfcffi©ttai5^>a3i¥a 

[0004] fcoi^^o^-y hsetfev^ta, -f-co 
h im% a o T i/ ^ * c £ « t # * ^ * 

[0 0 0 5] 

(i) (ii) #teiii;asi*«»)jfi-r, 

(iii) »^:fr#@^T&Sa^RBT*&?K ttZO&ti* 
33 -So 

[0006] fcr, *fgBjti, ±aonwti:a*T* 
JiBBWftjBWiBB* Rial a t s*j*^jattri£fttf : fi»*& 

[0 0 0 7] 



®«!pJBiJlg{c J: 0 WB'JSnfcflRWKfl&cjS LJTB/&-& 

^ ^ — * **jw-r 5 * - * wisuxe 

[0 0 0 8] C©«fc^**^riWffitt, «J*f±{*©SK 
[0 0 0 9] #«Wfc&S«j£f^j#gtB«:, ±3zE 

tt**W8ij"i-*«i(M*J9J¥at, LTa-rsrta: 

[0 0 10] C©<fc3fc«l££«*S§i*^lj!8gB«* 
58S±f*i&« tit ^ f ;l/0!S1ff««l^¥iJgiJ-r s «fltWB'J¥ 

a t <fc o wsj $ n fc jswkabk: *s u r i%<d it #> <d 
Amj^stc * o m*snfcf8is:£#«*a£*u saws 

[0 0 1 1] *«Wfc«SnjJ?y h»«t±, _hizE 

t , £1* txyKDJSI* me^W^iJ-T SjSWWSiJ^S t , 

i: TtjS'&jKo *o ^ v ^ - 9 * mm* %> > <5 p< - ^ 

[0 0 12] C©«fev*«ll«%«IASPJKy>»IHtt, 
«><D/ ^ p< - ^ <9 * - * fflW^SK: «t 0 WIW LT , 

[0 0 13] 
[0014] ( i ) ^{cfcsjgmagt

A5Ci:tt, AF^ i: o*fBtt%« fcftt *H«c*J» 

•y h«H*c*5V>Tti^S&K:^ffl-rs«fi|f«:ftS 0 
COO 1 53 -7?> Affl©ftojB1ffi:fg-e6tt5Sj*© 
W WSttfc: tBMtffc 5 if ? AHC O V , Fa i rbanks<D 
report (Fairbanks G. (1940), Recent experimental investigations of vocal pitch in speech, Journal of the Acoustical Society of America, (11): 457-466.) and the report of Burkhardt et al. (Burkhardt F. and Sendlmeier W. F., Verification of Acoustic Correlates of Emotional Speech using Formant Synthesis, ISCA Workshop on Speech and Emotion, Belfast 2000.)

[0016] ctiz<Dm , gic£%t, mmz^m^w* 
*^ijoTv>5 0 stcmcM^ twin, mmtmLfrftz 

WfcK»*f8SSK:*f LT t, p, "To 

[0017] mz.i£. &&Atm t ) j p%ti j p&zf%iita. 

yl/^— J&itoctfuJ&So *fc&3A#ilJB J f , *KL#* 
[0 0 18] 0*AfcT^'J*A{clti»(D^>* 

eafK»*ns^©mft*fTo fctsxtfAbei inp>o$s^ 

(Abelin A., Allwood J., Cross Linguistic Interpretation of Emotional Prosody, Workshop on Emotions in Speech, ISCA Workshop on Speech and Emotion, Belfast 2000.) and the report of Tickle (Tickle A., English and Japanese Speaker's Emotion Vocalizations and Recognition: A Comparison Highlighting Vowel Quality, ISCA Workshop on Speech and Emotion, Belfast 2000.) fc&So co.t'SfciaStea&sef&ft^ (0
HBOjt^fc £ o Tmm<Dmmmmt> e»&v\ a i) 



[0019] cne»cow^iss^#iS'r?>i:, AMfcffO 
x.(i"a^'-y hmwt<Dmc, m&D&mit&mt t&<^ 

[0 0 2 0] *f§W<DHS8iO^ST{i, 

Bm?li, (i) Xfcf-^OJ:?** (ii) IcW£f#fc 
&V\ (iii) *5le]jg? N ^^iLTV^o 
[0 0 2 1 ] El Hi, *JSWl:«*f^iJW 

lti^ai cftnnsjesft-r, sin#-y Mga^ 

O^-y hJ-Xy1-0#a=i>lf i — £ A I (artificial int 
elligence) ^©jgffltBJtlTfeSC fcti^T'fe 

[0 0 2 2] C(Dia HCfcVT, «^Z)C0Xx-y7 P S IT' 

Wfcti, 0U*.tf % (WWSH) ^rtffiOttflS 

-So 

[0 0 2 3] CCT\ CJtf-y h8iO«^Ktt, 

[0 0 2 4] ^IftC^Smt/^L^cDa^tTffil^CKD 

[0 0 2 5] £<DMfomi£mjg.-£tl2> 

[0 0 2 6] ^07f7/'S2tU *j5fcLT52-rS 
rt^^-TSIiS^ffi^-rSo C(DXT"-yys 2«, X 






^fc^bT^S-fedte^b^se:*:*^*, 
[0 0 2 7] C©£?K:C©Xx-y7' , S 2lc:fc^Taj? L J 

sfts«K:fc«, 7^^ft#i"Pt)«Sh5i-eJ& 

5Cfc{C«fcoTHSlbT<,>3 0 CCTV^^fiPttis £ 

gL<iiccvtucfe©t-fe5. nffimmvte. s 

»© ! PJBiJI8*fc:J& b/i £ r> ft? <9 * - * ©S'JiPtc «fc 

[0028] *se»ojBa6fc*vTH:, a*?<*ft 

S 1 T©«*«M©«&Jtt*fcJ£bT, SP-^titOfefe 

W««S©<ii:-pfet), !S«««SOWgiJ«es^ Mill, ¥ 

fctt, ±5aSO«HiJ«S*tbTO*«« (¥», St), 
b#, SO 5 , K*ti&r 

fcjtS bT c ft e> Of- ■7";U^ D 3 c t tf3£if 6 ft 

[0 0 3 0] ^Xf7 7"S4m ±3ZE©Xr--y^'S 

-f-f : speech synthesizer) ICjMO, ±3zEOX'r , y ^"S 
3T*iJSS?ftft/^^-^fc^oT^-&fig-r?)o 
^fig^ftTtfe.ftfcem^Jf-^ti, D/AERS 



Jf»Tyyse*^-bT^if-*K:j3S6ftSCfcftJ;0, H 
BR00*)»fcbT««6ft«o #'J*tf, P#7 fiiTfe 

#>y hK3o^T#£ftTXt?-7JfrP>, ^©t^O&W 

[0031] vA±mwLrc*Km<D&*wi%:nm<Di&m 

m) zvm-rzct?\ mmmmit^ft^tcnm^ 
n < , ^-ft-e^TKaa©* 5 tceac ?u 

ffi&/<7*-*©-aff*v>*\Afc:gHbfc!K 

[0032] c 2 ) mm t mmMmtD-gf&T ^ v xa 
mm t nmx tftz mmm t<D^T^vxL>ic-z> 

ttnrctz^ m&Mo, t-e&Za 

[0 0 3 3] £<D£oft3$f&±(D3iJ2t<Drctblc^j£-e;tfi 
§§ (Xt!-^i>>^+M-»f : speechsynthesizer) , h 

— T^V-te +f -fif : speech synthesizer) , Sb^hHt^ 
^^fA'NOA^tt, WSB© UX hi: ^ft^'ft© 

(»« m m ic n? z > tr > t— iP v mm) m v & % . 

[0 0 3 4] (2-2) fglSXO^ 

>^A*SHSfC«fcoT«l«bTV^So CCT, ^fiPCi, 
tlt'fe^f f C fcSf Vt»^T^»)> CV 

T\ SiRtt, 'JXhfcbT*fbTV5o fit, UXh 

•y^toT5t-r^?ftTl/^o 
[0 0 3 5] 0IJ^.(f, fe^^iR rbj It T448 10 150 8 

o issj t^-ordmc&^TmmztiT, uxhtceti 

$ftTV^S 0 " 448" {±, TbJ (Dftffimffl 

*M48ist»feSC4:*^bTV^o Sfc, 10" R 

Zf*<D$!.<D" 150" (i, ^B#R3448msCO10%-C : 150HztC 
S'JjS-rSut^bTVSo Sfc, ^<D^K<D" 80" Ettf 
^<D^.<D" 158" ti, ^«5tB#P^448msCD80%iC til58Hz(C 
SUiS-TSCt^bTl/^o COi^tcUTU^hOt 
^T©«^^a?g^ft?)o 
[0 0 3 6] mziclt^ T131 80 179J fc<J;t)-^x.5>n
&^fft TbJ Hi 20 200 80 229J (C J: 0 %X Zft 
5«* T@J fc, T405 80 169J fC<fc OMStlSf^ 

TbJ fcOte^fCfcoTaJK^nSBfilS^LTl/^. 
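Paragraph [0035] describes a phoneme entry whose numbers read as a duration followed by (position, pitch) pairs: 448 ms, then 150 Hz at 10% of the duration and 158 Hz at 80%. The following is a minimal sketch of how such an entry could be parsed. The whitespace-separated line layout `b 448 10 150 80 158` and the names `PhonemeEntry`, `parse_phoneme_line` and `target_times_ms` are assumptions made for illustration, not taken from the publication.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class PhonemeEntry:
    symbol: str                           # phoneme symbol, e.g. "b"
    duration_ms: int                      # total duration in milliseconds
    pitch_targets: List[Tuple[int, int]]  # (percent of duration, pitch in Hz)


def parse_phoneme_line(line: str) -> PhonemeEntry:
    """Parse 'symbol duration [percent pitch]...' (assumed layout)."""
    fields = line.split()
    # symbol + duration + complete (percent, pitch) pairs gives an even count
    if len(fields) < 2 or len(fields) % 2 != 0:
        raise ValueError(f"malformed phoneme line: {line!r}")
    numbers = [int(f) for f in fields[2:]]
    targets = list(zip(numbers[0::2], numbers[1::2]))
    return PhonemeEntry(fields[0], int(fields[1]), targets)


def target_times_ms(entry: PhonemeEntry) -> List[Tuple[float, int]]:
    """Convert percentage positions into absolute times in milliseconds."""
    return [(entry.duration_ms * pct / 100.0, hz) for pct, hz in entry.pitch_targets]


entry = parse_phoneme_line("b 448 10 150 80 158")
```

For the example entry, the two pitch targets fall at 10% and 80% of 448 ms.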

[0 0 3 7] c©j:5*«»*#lrt-r*fe«>fc**** 

*^!SHl»^fc:fefr^TSM£iina.6ftSCi;-C, HIS 

t«BIID^if*y^35rt^«H©fc*i>li:aaESti*. 
[0 0 3 8] c©J:5ft*iRfr6«liaW*58l£S:«, * 

*/«-rS#a»S< 1 >~< 5 >lC3Slf&9llWlC-O^T 

< i > 5t-r, *©*<D¥i§oaj&»i6So mar, 2 

0~MAXWORDScDP^cOSL^t LtftSt^o MAXWORDSti, 
;^^< — $T'&Z> 0 

<2> *JWH**l«-r*. IH*Wfc«\ $tf, 
{cfeVT#SSfc7 , ^-fe>'h*^S*^'5^«:5t^ (PROB 
ACCENT) T;ifc^r£ 0 
[0 0 3 9] Ji(T©<3- 1 >~<3-7>©3M&#fc: 

< 3 - 1 > &J*BO4<0tf ffiOffc£fti{>S. flUAfcf, 
2~MAXSYLL©ffl<DSLSJi: LT^tSo MAXSYLLte, * 

<3-2> CCT\ 7 , ^-fe^h^a6*#STfe** 

<3-3> *S»*C VilfrC c vgaofeot L. 
TStSfSo W*fcf, CVX9iQ«flKtt).8%®flk*-? 

<3-4> *0±5KaA/«CV^Ij;CCVOCft 

<3-5> ^*©J#MWra ; 5r> MEANDUR+ randora(DU 
RVAR)T?ftW-T5o lcT% MEANDURtiH^O^f^KT 
random(DURVAR)tm&&C&^£n£>firC«2&3 0 C 
CT\ MEANDUR&t/DURVARfi, ^-^J&CDfc&cD/^^ 

<3-6-l> ^CDtr-y^tDftW^r, e=MEANPITCH 
+ random (P I TCHVAR) Tlt^ £>« CCT, MEANPITCWi 
HS<0 fcT <y T £ 0 , random (P I TCHVAR) tt gL$! »C « 
n&iiT£3o ClT% HEANP I TCHR Zf? I TCHVARti N #J 



<3-6-2> CCt\ e-PITCHVA 

e +PITCHVARi: LT§^©lf y ^*#So 
<3-7-l> fcU *«JE 7* *-fe>h 

tfMwratcDURVAR^ii*n-r?. 0 

<3-7-2> ^LT, T^-feyhtffcSW^fcteV 
T, DEFAULTCONTOUR = risingT&§i:f?fcWu ?S<Dkf 
y?-*, MAXP ITCH-PI TOWARD U S^©fc?'y^£:> MA 
XPITCH+PITCHVARi:-r§o DEFAULTC0NT0UR = fal 1 

ingT?&5i:fffCt±, ^©hT-y HAXPITCH+PITCH 
VARfcU SfOtfy?-^ MAXPITCH-PITCHVARi:-r 
2>„ ?f.(C, DEFAULTC0NT0UR= stableTfcS *fC 
fi, ^Rtm^tf-y^, MAXPITCHi:-r?.o DEFAUL 
TCONTOURte, ggi5©tf'l4 (CONTOUR) J&jjrrt©^ 
£o Ufc> MAXPITCHJi, g^-g-^OfcibcD^^^-^T 5 

[0 0 4 0] W±©*5*<3- 1 >~<3-7>©# 

T, fM&(c£§g£;£©||^C < § ^©contour (tiff) 

< 4 - 1 > &*©s&©JMgf<: r * -tr > h frmvn 

fi\ e = PITCHVAR/2tC-r^ 0 CCT', C0NT0URLASTW0RD= 
faling"? ^SfifitCOVT, -(I+l)*e^rAP 
A, e=etett§„ KiWSRO^'r'y **%jjVrt>© 
$fc> M^^-^CONTOURLASTWORD=risingT* 
£>£i:#k:fct g-^gflfcO^T, +(I+l)*e£r/ra;^ e=e 

<4-2> -7a, m&<DmmicT?-iz>hfr&tii£, 

CONTOURLASTWORD = f a 1 1 ingT'&S i: ^IC«, &«f|J©# 
IBBWHfc:, DURVAR£:ftn*_?> 0 ■?■ lt v ^©tf-y?-^ 
MAXPITCH+PITCHVARt L, ©§©fc?«y?-£\ MAXPITCH- 
P I TCHVAR fTSo tU — *C0NT0URLASTW0RD=r 

ising-pfc5i:#fc«\ fttifffOMPMmiWlK:. DURVAR^r 
ftP^.So fit, ^(Dlf^^^r, MAXPITCH-PITCHVAR 
fcU S^©tf>y^^:, MAXPITCH+PITCHVARt-T^o 

< 5 > S^tC^^cD^U a— A^rVOLUMEtCtS^f 
^>„ CCT\ VOLUMES. ^^OfeftO/^^-^T 

[0 0 4 1] JjLhOJ:'5*#aRtSfi:fettS«iati:«fc 

aKA^riisnaidtcft*. fit, assxo 

[0 0 4 2] ftfe, ®3Stf04fC« > ±JdELfcJ;^^ 
(V-X3-F) Oiea!^LT(/^o EI3tc«ray 
[0 0 4 3] (2-2) Si«1tfCJ5Si;T#*en*-'<7 
HL^> W£f$&"'l£.Wi~%^m<* (calm, anger, sadness,
happiness, comfort) %?tplg<<fz>tlZ> 0 &*5, CCT^] 

[0 0 4 4] mm£> ctDko&mmtezti^ti. mm 

(Arousal) till® (valence) t *mmt?%W&Qm 
(Arousal) tilM (valence) t^mmt^^m^EfS 

±icto^r, as??, mry^^m^m^ (an ge 

r, sadness, happiness, comfort) ©rcW^^HifiSc^tU 

^(DtpMcw-B (caim) comm^m^n^ti^^d 

(CTfeSo m*.l£. W!) (anger) J fi«Bt*#-r 
^7i:Lta?n> [«L^ (sadness) J It rm&T* 

[0 0 4 5] tTFOSKfi, SO, gtm^lg 

-2 ('>^<tt^cO^Wra (DUR) , tf-y^ (PIT 
CH) (VOLUME) I?) Offl^f-^l/^SLT 

t^3 0 COJ; 5 ^-^l/^^tS^K^S^VT^- 

[0 0 4 6] 
[Table 1]

(Calm)

Parameter            Value
LASTWORDACCENTED     no
MEANPITCH            230
PITCHVAR             10
MAXPITCH             370
MEANDUR              200
DURVAR               100
PROBACCENT           0.4
DEFAULTCONTOUR       rising
CONTOURLASTWORD      rising
VOLUME               1



[0 0 4 7] 
[Table 2]

(Anger)

Parameter            Value
LASTWORDACCENTED     no
MEANPITCH            [illegible]
PITCHVAR             100
MAXPITCH             500
MEANDUR              150
DURVAR               20
PROBACCENT           0.4
DEFAULTCONTOUR       falling
CONTOURLASTWORD      falling
VOLUME               2


[0048]
[Table 3]

(Sadness)

Parameter            Value
LASTWORDACCENTED     nil
MEANPITCH            270
PITCHVAR             30
MAXPITCH             250
MEANDUR              300
DURVAR               100
PROBACCENT           0
DEFAULTCONTOUR       falling
CONTOURLASTWORD      falling
VOLUME               1


[0049]
[Table 4]

(Comfort)

Parameter            Value
LASTWORDACCENTED     t
MEANPITCH            300
PITCHVAR             50
MAXPITCH             350
MEANDUR              300
DURVAR               150
PROBACCENT           [illegible]
DEFAULTCONTOUR       rising
CONTOURLASTWORD      rising
VOLUME               1



[0 0 5 0] 
[Table 5]

(Happiness)

Parameter            Value
LASTWORDACCENTED     t
MEANPITCH            400
PITCHVAR             100
MAXPITCH             600
MEANDUR              170
DURVAR               50
PROBACCENT           0.3
DEFAULTCONTOUR       rising
CONTOURLASTWORD      rising
VOLUME               2



[o o 'b u rvn^jttTTzm wuristt&mffi 

icttm2ftz>si~> *-$frP>%:Z>7— >0L£\ UPg^iJ 

[0052] •?■ it, j eo«fc'5ti:aR«tej£i;Ta«?sn 
5tc&s 0 *urc©«t3K:^snfea5«a^a* 

P*-y bmW<Dmm%%12>C£tfT*%2>£olcl3: 

[0 0 5 33 fc}5, HSBOJgJg-ptt, tftttOftCfc'*? 

[0 0 5 4] (3) ^afiOJBflgtC.fcSDJKy hSHO 
(3-1) P#-y hgiOifilc 

p # >y h SB© V7h')i7 tcffffi • *iltfM9X 



[0 0 5 5] R*«i:UTOO#y h«KH\ 06(C/T 
f«fc9tCv TAJ *«Lfeje«?OVt>«>S^y bPtf'-y 

P.--yh3A, 3B, 3C, 3 DtfJUSSftSfcttK* 
S?ft:a53.- >y h 2 ©miilggPR^iSgPfc^n^naRgprL 

^•y h Ajkifmm^-v v s^mm^nxm^nr 

</^So 

[0 0 5 6] JPffcgfljx^-y h 2 Kit, I7t^tj;9 
fC, CPU (Central Processing Unit) 10, DRA 
M (Dynamic Random Access Memory) 1 1, J^yi/z. 
ROM (Read Only Memory) 1 2, PC (Personal Co 
mputer) — 7 x — X0SS 1 3RtHt*fSQ.3 

mssi 4 jb^gp^x i 5*^LTfflSfc»as*nsct 

tCfcOJfM^tl/iPVhP— /l/SPl 6i:, lOO^ h 
SSI (DW)J]M£ LW^yf'J 1 7 fctfiRtfiSnTl^ 
So £fc> flB{*gp:x:=.y h 2fCti, n#7hgilO|S] 

[0 0 5 7] $fc, Hgfla-v h 4(C«, 
ja#-r?.fci6c0C C D (Charge Coupled Device) ii * 

72ot, mm%fr*><D raiTSj ^ rnp<j i^ofc 

•y^-fe>-9-2 1 t, S9^fcffiH-rs%«*TOIE«*»J 
^•t§fc46©Sg^-b>^-2 2 i:, ^BPS^ifeWrSfcii) 
OY-f*B*>2 3 RSt^tD^^tB^-rSfc* 
©Xtf— # 2 4t, Pt>7 hgg 1 O Tgj 
LED (Light Emitting Diode) (H^-Sf) t&ifft* 

[0 0 5 8] ^Pg|5n->y h 3 A-3 DcDHaifi 

gP^&PgPP-^-y h 3 A-3 DltXfm#&3.=.y h 2 
OS-SBISSP^, m^=L=- >y h 4^riffii*g|5a-<y h 2 co 
ji«SaP», Mt>*(CK«gPP-->y h 5 (DKS 5 AOa^gP 

^ifteti^-n^nesasst^oT^^ax-^ 2 5 , 

~2 5 n R?>M^7 : ->v'3^-^ 2 6,-2 6 n #HBaft* 
ntV^S, 7yfaX-^2 5,~2 5 n (W 

»fc«fe0, IHlgPP--'y h3 A~3DjWMWSttT, 

[0 0 5 9] Lt, Cin^^iSje-trV-y- 1 8, iiPiSJif 
1 9, ^>y^-b>+)-2 1 , 8§g6-tr >"f 2 2, v-T 

6,-26 n ft4f©^&«-trV9-MmcL E 

5 , ~2 5 n t4, ^-n^nm-r?)^^ 

2 7 ,-2 7 n^^LTn^hP-yLgpi 6©«9JEa 

iuss 1 4 tmmzti. c c d*^2 oRtf/<yfy 1 
7 *ti^n«^«raigis 1 4 tfemmmznT^ 

So 

[0060] is^saa(e]g§ i a it, ±^co^y^-t^ 
drami lrt^m^ffiHfciR^ttiWj-rso ztzm^m
sassi 4(4, Lni:Wcn>yf u i 7*^«<esn5 

/^f'J Bifida*- / U iff- £ «rl«^ffi 0 & 

lMdrami i rt^m^esfc^-r-So 

[0 0 6 1 3 l^^CLTDR AM 1 1 ICfafflZntc 

lif- 21*. C<D'&C P U 1 OffC<DU#y hi£W 1 

[0 0 6 2] mM±.Q PU 1 0(4, ntf-y hSKlOtt 

hfcaW^n/fc^fJ*— K2 8X(4 
777->iR0M 1 2£«iMSnfcWJ»:/n$ r 7A*P 

»ffiU CMDRAM1 1 tcftW-rSo 
[0 0 6 3] Sfc, CPU 10(4, C<mk±M<D£o\Z 

\m®mmm \ 4 £ k> d r a m 1 1 icmx.fem-$ti&& 

[0 0 6 4] S etc, CPUlOa, COWWISSRtf 
DRAMI 1 {^^SIfiLfc^^^l^^^S7 ,, ^^^i^^cS-j•^^TM< 

7^fai-^2 5,-2 5 n ^IKl»)$-&^c:i:{c £ k 

— vb5 (OKM 5 A JfelbrtHirfc D » #Pgpn. - -y h 3 A 

[0 0 6 5] CO^C P U 1 0(4, j&BtCJSCT 

f^fftLT7tf-*2 4(c#*.;g>c£:(c4;DiiI£g ; 

pmmcm^<m^^Hmcm,t}^^rc^, hole 

[0 0 6 6] C©4o(cltCODtvy h^I 1 icte^ 
T(4, SSRtfJSBOttiR^ filffl#A^©*§^Rtf« 

[0 0 6 7] (3-2) Si)ffl)7 p D^^Acoy7 b^xZ 

ov7 h*>xTflifig{4, 0 8(C7rcr<£5(c&;i> o ^00 

8(C£;^T, f/^X ■ K^-TM • U^t3 014, CO 
Kv'l'^frPiaSf/WX • K^-fVS • -fey h 3 1 fr£ 

M?nn^o c©*§-g\ &tWx • K^f ^4, 
CCDA^720 (07) ^>^^^m(Dmw<D=i>^a. 
-^Tfflt^nSA- K^xZtcUST^-trX-rscri: 

[0 0 6 8] %fc, P#f^ >y£ • -9— /* ■ *7~iSxt 
h3 2(4, f/WX- K7-f^- W-V3 OO^TfuJf 
fcffiHU m(4±i$?>&ffl-fc>-9-^>7 7^xX-* 2 



5,-25 n ^CD/A- K7x7(C7^-teX-r&fc460-f 

> * - -7 x - x * m it -r § v 7 h x 7- s¥ t* * s > * - 

■WU • ntf<y h33t, ttMO #l&;t & if ^SS^ S V 

xVWX • • t*— i^-v 3 5i:> a^-y h^H 

n#>y h 3 6 fc*>£#|JiKSnT^5o 

[0 0 6 9] • 5 h 3 7 (4, *7*J 

x^ h • V*->>> 3 8&t>*+»--trx • V*— i?\ 3 9 
J^6*|J«**TTV^*. *7*^i^h • *J\ 3 8 

(4, P/ff^yi'-t-^.^7'i'xi'h3 2, 5K>1/ 

•7x7' W-\M 0, RZfTZfVfr— ->3> • U-f-V 

4 1 ic^sns&y? h*x7i¥©g»'*H7* , &a-r 

SV7b')x7i1?fe0> trx • v^.— ~» 3 9 
(4, **y*— K2 8 (07) (cf&*rtSnfc=J*^>a 

ffiica-zsv^T^^^x^ h<Dmffi*mm-?zv7 v-v 

x7Sl?fe5o 

[0 0 7 0] 5 • 7x7 • W-^4 0(4, n#x-T 
>y * . -9-—/^ . 3j-y^ x ^ h 3 2C0±{4H(cffiBL, H 

id^Httt^ v 7 h ^ x /Sfr^M? ntv^^o * 

fc, 77 ,| Jt->3>- Wt4 1li, 5 VJl - 7x7 
• 1^7-^4 0<D±fit«K:{fiBU ^=K/P"7i7' 
4 0 %mi&-? 7 h •> x J: o T^S^ 

[0 0 7 1 ] **5, 5 HVU '7x7- U-f*4 0M7 
^U-5r— */g>. U^^4 10flftfty7h')x7i« 

4fnfni 9 tc^-To 

[0 0 7 2] $K;l/"7x7- U>ft4 0(4, 0 9(C^ 

p^tsiiffl, mmmtam, £m&&m. ?7fty^, 
If ^tBfflRtffe^ifflO^f ifflit^i-^s o~ 

5 SMicA^'fevyf^ ^xny^-^y'a- ;b5 
9*H*W-r*B2SI^6 0 ttl^yf^xny 

^*->*jl— ;U6 8MtfK§BfM5affl, bf>v*>tf 

ftmRzf : gn£m<o»mmm : ei?3.->v6 i~6 7§ 

[0 0 7 3] BMGR6 0OS«f«ai ; &ya-;l'5 0~ 
5 8(4, ot*f^ >y^ • -9—^ • t^xi' h 3 2 OA 
-^■Wl/ • D/1?«y h 3 3IC£9 DR AM 1 1 (07) * 

^^X3yM-^ti>'a-^5 9(t^^ 0 CCT', 0IJ 

^.(4*, /s— • a#<y h 3 3(4, m^Oj§{t#i^(c 
4:oT, e^OSSa8i/^«as»*-r*W»i:LT«l«« 
[0 0 7 4] AA-t?v>f-c ^Xny^-Hi'a-il/
5 9ti, cn^i^ftl : tv'"a-;l'5 0-5 8fr?.# 

~>3>- u-c-v4 1 (0 7) icmt)-?z> a 

CO 0 7 5] 7/!jy->3> • \y^"Y4 Hi, 010 
tc^l-i^t, tTl^xVl^-f 75 U 7 0, 
^J^—)V7 1 % ¥S€-^a— ;l/7 2, J8«€-r;U7 3 
St/^fji^T 2 '^ 7 4<D5 0CD^^-yl^£«JiK£flT 
^5, CCT, gSt€f^7 3^ J1.»^6©*jattfK: 

jtS i: T±a> L <k 5 *«KA'N©»«a3S«)fi»^* $ 
ft3 0 3; ft, £ <D<fc 5 ft^M^E-r;!/ 7 3 -^fil^-r A< 7 
4*©«tOia, "T fcfol5*©#ffi©!pjgijt$«, CP 
U l 0l¥©»IW^Sk:J;oTftSnSo 
[0 0 7 6] ffSft€"r^7-r77U 7 Otcti, 0 1 HC 
<fc5tc, r;Vyf'JSl* , '>a<«:oft»^j , 

nEffjimrrsj , r»#Hs&i§i3Bf . mm 
«>b w * n rc t, > < o *><D^ft rs g n -rn wjc * * 

Tn *tt^tt&:siLfc?Tl!l ; E7 i /l>7 0,-7 0 n wmf 

[0 0 7 7] LT, cnP.fTWj^-rVl/7 0,~70 n 
ti, ; en^tlA*-bT>^i' ^X3yM-^€i?a-;l- 
5 9 fr6KW*S»#-5-*. fcftfc t * «%£>SaK*3£ 
A 6 nT3ft» 6 — *^ifl L fc t # ft if , £>B 

tc^cT^o «fc 5 icmm tf^ 7 3 icGg^n-r^s 
^-^7 i tctB^-f 3„ 

[0 0 7 8] ftfc\ C OUSBOJ^IiOJS-g-, ^frSj^-r 
;W0,~7 0 n a, ^<0fTi%ttSt5¥ffitLT, 
01 2{C^f<fc'5ft 1 ooy-H (tffg) N0DE o ~ 
NODE n fr&ffi<D£<Dy — KNODE 0 — NODE n 
lcM&-?Z>fr*&s / — FN 0 D E 0 — NODE n icr3% 

m^nrcM^mm? , ~p n tcs-c5vT»*wfcS^-r 

S *|5I6S^4-- h- t h > i: nf (in« 7^1/ =r y X A fcffl v ^ 

^) o 

[0 0 7 9] #|{*WfC, Sfrl&efVW 0,-70 

n «, ^n^n^B^nwj^jv? o , -7 o 

t5/-KN0DE o ~N0DE n K^tl^tlW^** 
T, cnfiy-FNODE 0 ~NODE n a(C|13 



[0 0 8 0] C<D«88j1^8 OT'li, fC/-FNO 
D E 0 — NOD E n K.1S\,'>TM1&kWrt-r&\t}'r'<>' 

[0 0 8 1 ] Ltctf-oT, 11 3<DmiI^i8 0fi 

?n§y-KNODE , 00 Tfi. r^'-;u^m (b 

ALL) j fci^S SttfcW^fc, SKIS 

S»l8ai:ttfc#*&ns*OJK--;i/0 r**s (s I z 
E) j # ro^p>ioooj oSBHT-feSdi:^ rmmvn 
(obstacle) j t^vwstttimff&z.'o 

rgggt (D I STANCE) j # roA^iooj <D 

mm t s c 4: ^fa© y - f k. m&? s tc n> <osm t ft 

[0 0 8 2] Sfc, CCD/— F N 0 D E , 0 otftt, 12 

it^soA^^ftv^^tcfe^Tfc, nm^v-'/i i o , 
-7 o n tfmnm^m?z>mm*:7*)i<7 3sm^ 

(JOY) j, rit (SURPRISE) j£L<li 

rjKu* (sudness) j o^fn^o/<7^-* 

[0083] $ fc, wmm®m 8 o -m, rt©/- f 
^(ommmi <Dwicmt% m^fty-Fj o^jtc^ 

cDy-FNOD E 0 ~ N0DE n ^5I8ttS/- 
T©*ff*«a»o/S: t ^tC«^T-#Sffi0^y- F N O D 

e 0 ~node„'noI8WW rfluoy— K^Olg; 

y-FNOD E o-NOD E n ICW&? Zmctiit}-?^ 

(Dm&wm <DmteMfz>&tf<Dffim<Dmz \ o o 

[%] ilftoTI/^So 

[0 0 8 4] LrctfiT, 11 3<0«llgg8 0fl 
SftSy-FNODE j 00 T?tt, fiJA(i r*— 
tB (BALL) J U ^©3}?— r S I Z E (A# 

$) j & ro^eioooj (omn-eibzt^omm&zktfi 

^Z.Zfttzm&tCte. T30 [%] J OSSJfST ry-FN 
ODE i 2 0 (node 120) J Klgtf, ^<D£:# r A 
CT I ON 1 J <DftW}&ti\J3Zn&£ttl3:&o 
[0 0 8 5] ^m^jV 7 0 , — 7 0 n «U ^tl^'ft 
C <D cfc 3 ft 8 0 £ LTgBMS flfc y - F N O 
DE 0 ~ NODE n *^^<OfeR^a«t3fCLT«fig 
5 9*^KW*SS*^A6nfefc#*4!f!:, fcfjfrr*/
— FN0DE o ~N0DE n ©ttflg»#S*fUffl LTBt 

[0 0 8 6] @1 OtC^-rtTW^^^r^a.— ;!/7 1 
^ ffK^x^v-YyvU 7 0 £D=&tf i&^fVl/ 7 0 , ~ 
7 0 n fr& ; tft j efta**ftS?T»'<0'35, ^46^466 
ftfc«ftlEf£WJft^fTi&*'r /1> 7 0,-70 n ^P»ttJ^/ 

F CWIn Ctl^ffl^yKt^^ ) 45K;1/--) 
It, mi Hc^XTWcmi-£tittffWi ; e j rfl7 0 , 

~7 o n aif«$tJWfflft , «?s<ias«nTv^So 

[0 0 8 7] Sfc, fTWffl&tti? =l—)Is7 1 ti, ff») 
[0 0 8 8] —7?, ^fti>'a-;l/7 2(i, AM^y 

[0 0 8 9] LT, ff€>'*a-;W2tt 1 C©&« 

$E[t+1] = E[t] + k_e \times \Delta E[t]$
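The update rule of equation (1) can be sketched in a few lines: the next value of an emotion parameter is the current value plus a sensitivity coefficient times the computed change. This is only an illustration. The function name `update_emotion` is hypothetical, and clamping the result to the range 0 to 100 is an assumption made here, not something stated by the equation itself.

```python
def update_emotion(current: float, delta: float, sensitivity: float,
                   lower: float = 0.0, upper: float = 100.0) -> float:
    """Return E[t+1] = E[t] + k_e * dE[t], clamped to [lower, upper].

    current     -- E[t], the present parameter value
    delta       -- dE[t], the change computed for this step
    sensitivity -- k_e, the per-emotion coefficient
    The clamping bounds are an assumption of this sketch.
    """
    value = current + sensitivity * delta
    return max(lower, min(upper, value))


joy = update_emotion(current=50.0, delta=10.0, sensitivity=0.5)
```

With a sensitivity of 0.5, a change of 10 moves the parameter by 5; values that would leave the assumed range are held at its bounds.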

[0 0 9 3] &33, SagaMSS^ffi^-fevVx-f^Xn 

*{GlcD§£»SAE [t] H:ifOS«0»«S:^AS*H4 

[t] nrc&ftfcj fcv>ofclg 

[t] lc**fci&«;&#*.S«J:5tefcoT^So 
[0 0 9 4] CCT, tb^-bv^-r-^ 

TfcjBMJ&sats-ii:*,, cnti, rwx.^,j t 

v - o fcfr»fc <fc t) & D <DS51ft U^l/A^F i: v > -3 fc «t 

t^a— ;i/6 8*->e><oa*n{i, laifcfst^'a-* 

ifiatcS^^Ttr Wjtv*^ 7 0,-7 0 n <D*ffS-f 351 
[0 0 9 5] ftmit&%L<D7 4 — K/W*tt> *T» 



7 OtC*3tt?>^-r«.tf») ; ef : Vl/7 0 , ~7 

[00 90] flfe*, S**^ 7 3 rga* (jo 
y) J , rggL^ (sadness) J , r«j£D (anger) J , 

(surprise) J , mM (disgust) J RZf r& 
n (fear) J CQ-&tr 6 OOW ftfcO^T, C* i: fc 

5 9*>e-^^p>ti^ rpp^n/cj r«n-ee>nfej & 

<£>o 

[00 9 1] Hft&tycti, JSfflHerVl/ 7 3f±, A^-trv 

y-rw ^X3>^- **5>a— ;i/5 9^6-5-*. snag 
Kta%2:. *<D£%(Du$.y hmmi <Dfimt\ turns 

[tj > m&<D*(Dffim<Ds^ *-$mzE [t] , ^ 

(DWWl©«fi*af k e £ LT, ( 1 ) 

T^©fflWte«5»tS*<DW»I<0^"7P<-^ffiE [t + 
1] ^r@tbL, Cft*m&<D*<DmWl<Drt<7*-*mE 
[t] £WZ&Z.2>£ol l CLT : £<DmW)C0'^X—$tm 
ZMmir% 0 %rc. 8itf;l/7 3tt. c:ni:[H)«tcL 

[0 0 9 2] 
[Equation 1]

... (1) 

id) icj;!)^?n5t©r6oTfeJ;K 

[0 0 9 6] — 73, #fc£ j e7 :: ';l'7 4ti, r®®]^ (exer 
cise) J , rg1tg>; (affection) J , (appetit 
e) J RXf r^f^;^ (curiosity) J OSVMC^it Lfc 4 

oco^tcov^T, c nearer i:»«:*©a)(«oai** 

7>f^X3^Wt>/a-il/ 5 9^e>-^X.P>tl§ 
BWISIR^, «3aii*mRt«flW!»* ; es'a-;U7 1A> 

[00 9 7] *f*fl*)fC«, ^ffi^E-r;!/ 7 4 tt, rS») 

-;l/6 8^p.cDjiSa^iffcao"v^Tm^O)^S^(cJ;t7 
gm^n-g.^Oi:#<D^-^^5R<7DS®)ffl^A I [k] , 

sawfoaso/^^-^Mi [k] , ^©a>;*o 
»s*a-r«»k , £ lt> mffinw-e (2) s^ffl^ 

T^OJH^tcfc^S*©»[*«>/<7P«-^fit I [k + 
1] *JStHU Z<DffiMm$kZmtE<D*(Dffl.&<DJ^* 
[0 0 9 8]
[Equation 2]

$I[k+1] = I[k] + k_i \times \Delta I[k]$ ... (2)

[0 0 9 9] ftfc\ KMfiSS&O t ai*"b ,; ?>'r-f 
>M-?ti;a-;l/6 8*^ ©iiftlft if &8K5R©; S => 

^-^fot»iAi [k] fce<Dm&<DKW**3-*.& 
£^3.— ;i/6 8fr5©a»H±* mgnj ©^ 

^-^iofllAi [k] fc^:f?aR««^s«J:5 

[0 10 0] ftfc\ *HSgcDff^C*5V^T(i, 
O^SX* (#*&) ©^5*-*fitf^ft**ft0fr 61001: 

[0101] 5 F;l/ • >)i7 • l"f J ?4 0<D&f3 

4 1 (D^mm^^J^-^i ifrp>#^.e»n?. tm 
it j , rs^'j , tp,kj x« r^^-y^>^ (#-^ 

36 6 9 0*fJS-r Sf^lt^a-^e 1 ~6 7 

[0102] ZLTcnzm^wim^J ^.—ivs i~6 

7 fT®J3Y>K^^5n5 m&ViWi-a Vic 

^25,~25 n (0 7) ic^zz^v— xmfrm 

V, Xt?-# 2 4 (07) a^ffl^fS^Wf*-* 

cne.©f-^%D^f^ -y • -9-—/^ • ^-7*v ? x-5' h 
3 2©/S— + • Ptf-y F 3 3&tffi^«Uil§I!S 1 4 

(07) WMLTMt57^f aX-ii 2 5 , ~ 
2 5 n Xt±7.tf— # 2 E D tCjlR&jMai-f §„ 

[0 10 3] tOJ:5tLTn*>y hglllcfc^T 

C fc g ftWftfrlbfcfT 3 C i: #T*£ & cfc 5 {c ft £ nr (/> 
[0104] (3-3) djJW Fgffl'N©^©/';^ 

y Xi»©n» 

±xEO^IfS©7';l/=ryxA(i, ccDcfc -5 ft n h^g 
1 ©09 4"©eW^^JL— 7 LTHg^nS. 
[0 10 5] 114*^1-^6 77(1, ±ffi©W# 
(0IJ*«:\ fTWt-r/W tcT&££tlfc#tU^3WF 



(«Atf. rstfTfKK-&«fcJ ft if) HIR©© 
^^Jx-^^JSRb, «C^-ft^D#iy h 3 3 

coxtr— Af^i'7tLttiTf-$?:Sit2i. cntc 
a^ft^nfciiM^A^ 6 ft sfsis&tffs-e e 

[0 10 6] SS«»c^fc-&fc»»3v>K*S=*«-r*fT 

it^o ztmnm^TM*. 01 o tenure vrm^^ 
7-rr^y 7 ofcT-otfKi^x^i: LTffl«^nrv^ 

-So 

[0 10 7] ^effWtx/i/TJi, SHIItf^l/ 7 3-^* 
*©«tdft*/^^— *filfcSr3v>TI2l 1 OKTjVTJ:? 
5« -Tftta^ feS««l*^©a»*#i:LTiB«©« 

v > , * ©jsnite bp l rzftsB'imzm&mcmfiir % 

<fc?KbTV>£o 

[0 10 8] SKfTtt^x^aMfiffif *tt!fi*££tt, 
0 1 4fc^-r«fe5k:S3S-rsci:^-e#S« > ft 

fc\ 01 4^^-r^tgRKi^f-vi/^^ffl-r^m^^ 

ti> ±3£©0 1 3tC^L^«fflS»^8 0©SiEJfc$# 
Sft-=>TV>3#, ^IWcliSSSfeOT'S^^o 01 

[0 10 9] C©0ljT?fcl\ /— F# "node XXX" fr<b 
mcDS— F^tDM&Zkftt LTltf (HAPPY) , j&L# 

(SAD) , SO (ANGER) Rtf£-rA7«/- h (TIMEOUT) 
*^64aT^5o fit, Bt>* (HAPPY) , ML H (S 
AD) , SO (ANGER) SO^-f^T 7 ^ h (TIMEOUT) 'N© 

wi&kfct LT©*ftWftSS[ffi*v ^-n^n happy>7 

0, SAD> 7 0, ANGER> 7 0St/TIME0UT=timout. 1 i L 
T#^.P>tlTV>§o CCT\ tinout. HitfeflT'feO, 0<J 

[0 1 10] 3:fc, /— F*^ "node XXX" 
ft/— FtUTs nodeYYY, nodeZZZ, nodeWW 
W, nodeVVV^ffl«*nT*5 0, ^©cfc 5 ft^&/- F 
fCttLTUffSftSfffta^ft^ft rA>^W (BANZA 
I) J > r^-fejAty (OTIKOMU) J , r&%,&% (BURUBU 
RU) J Rt>* r$,<t>" (AKUBI) J i: LTfijoarp»nT 

[0 1 1 1] CCT\ TTf^ (BANZAI) J ©«JitT» 
fix Tgtfj ^WSI^^nSfgfS (talk_happy) 
L, tZTz.mmmz&Z>7iWL<DWlft (motlon_banzai) 

^ c btiCl7lM : %:MZ>W)i / P (raotion_swingtail) %t"?> 
fe©t LTS8LTV>§o CCT' T§t>"j ©jKWS31* 
Lfc^|g^t-§rc46{c, ±a?Lfcj;3ft^i6ffl.«^nT 

ftto-fe. LTV>S58|g©7';l/=ry Xl*lcm~3^ 

[onajtft, m%&ts (otikomu) j <o&mff_ 
miz, r«5L*j nmtHmztizstm (taik_ sa d) *

U Sfct^^&t^LtffciifF (motlonjjiiji) «rf 

[0 113] $fe, r^S4S« (BURUBURU) J <Dmffil 
WHt> HSDJ jWBtiiaSlSftSfHg (talk_anger) £ 
U ^fc^OO/c^tca^TI/^K)^ (motion_burubur 
u) *t5tOi:Lt5£«LT^5. llT l~<g*5J CD 

[o i i 43 r*<t/ (akubi) j (DmmnW) 

ti, ffi&&<iIJBfc©Ta&< tffc-rSWlfF (raotion_aku 
bi) tLT^iLT^So 
[0 115] <l<D&5lcm&"If£%:&S— KtCfcV^TH 

ff«n*«fT»««sEii*nT*t), ^ojcaarfty-K 

[0 1 1 6] Ell 4 {C^Vf fi&jT'fct, Stf (HAPPY) ©Jg 
*fcit^C{il00%c96P£T T73^ (BANZAI) J ©*5? 

ffw-pawsnso 3=fc> nu (sad) ©i©-&, -r* 

fr*SAD©te#mj£©HBffifc SttS 7 O^rS^fc^fc 
H\ ioo%©6S*t ng-s&ty (otikohu) j ©SiMtK 
■eawsnso so (anger) <om&. -r&frOA 

ioo%©5»t n&s.s:s (buruburu) j (ommffWiw 

IjR^nSo fit, ^AT^h (TIMEOUT) Q« 
£\ f^:t)^TIMEOUTCOffl[^m^O@eiifc$n«tiniout. 

100% ©awe r&<tf (AKUB 
i) j o«STT»^aw*nso fc*5, £ti 

*5 ff s «ns mszm ic l t v en 
tcis^ftscfcaft^,, -r*fc>-&0y*tf, st; (happ 

Y) ©*§^C, r7JS (BANZAI) J cr>RWj^70%T-31« 
[0 117] J-X±©cfc 5 tt^lStT»I^^KDt>cffi«^* 
[Fig. 3]

Choose the number of words of the sentence (random number between 2 and MAXWORDS);
Create the words:
    For each word, choose the number of syllables
    (random number between 2 and MAXSYLL), and
    decide with probability PROBACCENT whether the word is accented or not;
    If the word is accented then choose randomly one
    of its syllables and mark it as accented;
    Create the syllables:
        For each syllable
            choose whether this is a CV or a CCV syllable
            (CV syllables have probability 0.8);
            instantiate the C's and V by picking randomly a
            consonant or vowel in the phoneme database;
            set the duration of each phoneme to MEANDUR + random(DURVAR);
            let e = MEANPITCH + random(PITCHVAR)
            set the pitch of consonants to e - PITCHVAR
            set the pitch of vowels to e + PITCHVAR
            if the syllable is accented then
                add DURVAR to the duration of its phonemes;
                if DEFAULTCONTOUR = rising
                    set the pitch of consonants to MAXPITCH - PITCHVAR
                    set the pitch of the vowel to MAXPITCH + PITCHVAR
                if DEFAULTCONTOUR = falling
                    set the pitch of consonants to MAXPITCH + PITCHVAR
                    set the pitch of the vowel to MAXPITCH - PITCHVAR
                if DEFAULTCONTOUR = stable
                    set the pitch of phonemes to MAXPITCH



[Fig. 12]

[Fig. 4]

Change the contour of the last word:
    if not LASTWORDACCENTED
        let e = PITCHVAR/2
        if CONTOURLASTWORD = FALLING
            for each syllable in word
                add -(i+1)*e to the pitch of its phonemes
                (i = index of phoneme in syllable)
        if CONTOURLASTWORD = RISING
            for each syllable in word
                add +(i+1)*e to the pitch of its phonemes
                (i = index of phoneme in syllable)
                e = e + e
    else
        if CONTOURLASTWORD = FALLING
            for each syllable in word
                add DURVAR to the duration of its phonemes;
                set the pitch of consonants to MAXPITCH + PITCHVAR
                set the pitch of the vowel to MAXPITCH - PITCHVAR
        if CONTOURLASTWORD = RISING
            for each syllable in word
                add DURVAR to the duration of its phonemes;
                set the pitch of consonants to MAXPITCH - PITCHVAR
                set the pitch of the vowel to MAXPITCH + PITCHVAR

Set the loudness volume of the complete sentence to VOLUME.
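The listings of Fig. 3 and Fig. 4 can be turned into a small runnable sketch. It is an interpretation, not the publication's implementation. Assumptions: `random(X)` is read as a uniform draw from 0 to X; the phoneme database is replaced by short stand-in lists; `max_words` and `max_syll` get placeholder defaults because the tables do not state them; and `e` is doubled after every syllable of the last word for both contours. The parameter values are those printed for the calm state in Table 1.

```python
import random
from dataclasses import dataclass


@dataclass
class Params:
    last_word_accented: bool
    mean_pitch: float
    pitch_var: float
    max_pitch: float
    mean_dur: float
    dur_var: float
    prob_accent: float
    default_contour: str    # "rising", "falling" or "stable"
    contour_last_word: str  # "rising" or "falling"
    volume: float
    max_words: int = 5      # assumed, not given in the tables
    max_syll: int = 4       # assumed, not given in the tables


@dataclass
class Phoneme:
    symbol: str
    is_vowel: bool
    duration: float = 0.0
    pitch: float = 0.0


CONSONANTS = list("bdgklmnprst")  # stand-in phoneme database (assumption)
VOWELS = list("aeiou")
CALM = Params(False, 230, 10, 370, 200, 100, 0.4, "rising", "rising", 1)


def apply_accent(syllable, p, contour):
    """Lengthen a syllable and place its pitch around MAXPITCH."""
    for ph in syllable:
        ph.duration += p.dur_var
        if contour == "stable":
            ph.pitch = p.max_pitch
        else:
            # rising: consonants low, vowel high; falling: the reverse
            sign = 1 if (contour == "rising") == ph.is_vowel else -1
            ph.pitch = p.max_pitch + sign * p.pitch_var


def make_syllable(p, rng, accented):
    n_consonants = 1 if rng.random() < 0.8 else 2  # CV with probability 0.8
    syllable = [Phoneme(rng.choice(CONSONANTS), False) for _ in range(n_consonants)]
    syllable.append(Phoneme(rng.choice(VOWELS), True))
    e = p.mean_pitch + rng.uniform(0, p.pitch_var)
    for ph in syllable:
        ph.duration = p.mean_dur + rng.uniform(0, p.dur_var)
        ph.pitch = e + p.pitch_var if ph.is_vowel else e - p.pitch_var
    if accented:
        apply_accent(syllable, p, p.default_contour)
    return syllable


def shape_last_word(word, p):
    """Change the contour of the last word as in Fig. 4."""
    if p.last_word_accented:
        for syllable in word:
            apply_accent(syllable, p, p.contour_last_word)
        return
    e = p.pitch_var / 2
    sign = 1 if p.contour_last_word == "rising" else -1
    for syllable in word:
        for i, ph in enumerate(syllable):
            ph.pitch += sign * (i + 1) * e
        e = e + e  # doubled per syllable (interpretation)


def generate_sentence(p, rng):
    """Return (sentence, volume); a sentence is a list of words of syllables."""
    sentence = []
    for _ in range(rng.randint(2, p.max_words)):
        n_syll = rng.randint(2, p.max_syll)
        accented = rng.randrange(n_syll) if rng.random() < p.prob_accent else None
        sentence.append([make_syllable(p, rng, i == accented) for i in range(n_syll)])
    shape_last_word(sentence[-1], p)
    return sentence, p.volume


sentence, volume = generate_sentence(CALM, random.Random(0))
```

Passing a seeded `random.Random` keeps a run reproducible; a different `Params` instance gives the same procedure a different emotional colouring.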




[Fig. 6]

[Fig. 8] (block diagram; reference numerals 30 to 41)

[Fig. 11] (reference numerals 70, 71)

[Fig. 9]

[Fig. 10]




[Fig. 13] State transition table 80 for node 100

Destination nodes (columns A, B, C, D, ..., n): node 120, node 120, node 1000, ..., node 600
Output actions: ACTION 1, ACTION 2, MOVE BACK, ..., ACTION 4

No.  Input event  Data name  Data range  Probability
1    BALL         SIZE       0 to 1000   30%
2    PAT                                 40%
3    HIT                                 20%
4    MOTION                              50%
5    OBSTACLE     DISTANCE   0 to 100    100%
6                 JOY        50 to 100
7                 SURPRISE   50 to 100
8                 SADNESS    50 to 100
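The state transition table of Fig. 13 drives a probabilistic automaton: an input event whose data value lies in the listed range triggers a transition to another node with the listed probability, and an action is output on that transition. The sketch below illustrates the mechanism for the one row the text spells out (BALL with SIZE from 0 to 1000 leads to node 120 with probability 30% and outputs ACTION 1). The table layout, the function name `step`, and the choice to stay in the current node when no transition fires are assumptions of this sketch.

```python
from typing import Dict, List, Optional, Tuple

# (probability, destination node, output action)
Transition = Tuple[float, str, str]

NODE_100: Dict[str, dict] = {
    "BALL": {
        "range": (0, 1000),  # accepted SIZE values
        "transitions": [(0.30, "node 120", "ACTION 1")],
    },
}


def step(table: Dict[str, dict], current: str, event: str,
         value: float, draw: float) -> Tuple[str, Optional[str]]:
    """Return (next node, action) for one input event.

    draw is a number in [0, 1), normally random.random(); it is passed in
    explicitly so that the behaviour can be reproduced.
    """
    row = table.get(event)
    if row is None:
        return current, None
    low, high = row["range"]
    if not (low <= value <= high):
        return current, None
    cumulative = 0.0
    transitions: List[Transition] = row["transitions"]
    for probability, destination, action in transitions:
        cumulative += probability
        if draw < cumulative:
            return destination, action
    return current, None  # no transition fired (assumed: stay in place)


next_node, action = step(NODE_100, "node 100", "BALL", 500, draw=0.1)
```

Supplying `draw` from `random.random()` gives the probabilistic behaviour; fixing it makes a single decision repeatable.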



[Fig. 14]

node XXX

Condition-label:
    HAPPY       happy > 70
    SAD         sad > 70
    ANGER       anger > 70
    TIMEOUT     timeout.1

Arc-label:
    BANZAI      node YYY    talk_happy, motion_banzai, motion_swingtail
    OTIKOMU     node ZZZ    talk_sad, motion_ijiiji
    BURUBURU    node WWW    talk_anger, motion_buruburu
    AKUBI       node VVV    motion_akubi

Probability-table:
                BANZAI    OTIKOMU    BURUBURU    AKUBI
    HAPPY       100
    SAD                   100
    ANGER                            100
    TIMEOUT                                      100
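The node description of Fig. 14 combines three tables: conditions on the emotion parameters, arcs that name a destination node and a list of commands, and a probability table linking the two. A minimal sketch of how they could be evaluated together follows. The data structures, the function names, the representation of the timeout as a boolean flag, and the order in which the conditions are checked are assumptions made for illustration.

```python
import random
from typing import Dict, List, Optional, Tuple

THRESHOLD = 70  # "happy > 70", "sad > 70", "anger > 70"

ARCS: Dict[str, Tuple[str, List[str]]] = {
    "BANZAI": ("node YYY", ["talk_happy", "motion_banzai", "motion_swingtail"]),
    "OTIKOMU": ("node ZZZ", ["talk_sad", "motion_ijiiji"]),
    "BURUBURU": ("node WWW", ["talk_anger", "motion_buruburu"]),
    "AKUBI": ("node VVV", ["motion_akubi"]),
}

# Probability (in percent) of each arc for each condition.
PROBABILITY_TABLE: Dict[str, Dict[str, int]] = {
    "HAPPY": {"BANZAI": 100},
    "SAD": {"OTIKOMU": 100},
    "ANGER": {"BURUBURU": 100},
    "TIMEOUT": {"AKUBI": 100},
}


def active_condition(state: Dict[str, float]) -> Optional[str]:
    """Return the first condition that holds (checking order is assumed)."""
    for name in ("HAPPY", "SAD", "ANGER"):
        if state.get(name.lower(), 0) > THRESHOLD:
            return name
    if state.get("timeout", False):
        return "TIMEOUT"
    return None


def select_behavior(state: Dict[str, float],
                    rng: random.Random) -> Optional[Tuple[str, List[str]]]:
    """Pick an arc for the active condition and return (node, commands)."""
    condition = active_condition(state)
    if condition is None:
        return None
    arcs = PROBABILITY_TABLE[condition]
    names = list(arcs)
    weights = [arcs[name] for name in names]
    arc = rng.choices(names, weights=weights, k=1)[0]
    return ARCS[arc]


result = select_behavior({"happy": 80, "sad": 10, "anger": 5}, random.Random())
```

Because every row of the printed probability table assigns 100 to a single arc, the choice is deterministic here; weights below 100 split across several arcs would make it probabilistic.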



(51) Int. Cl.7
    G06F 3/16    340
    G10L 13/08

FI
    G10L 3/00    E
                 H

F-term (reference)
    2C150 BA11 CA01 CA02 CA04 DF03 DF04 DF06 DF33 ED42 ED52 EF16 EF23 EF29 EF33 EF36
    3C007 AS36 KS11 KS15 KS23 KS24 KS31 KS36 KS39 KT01 LW12 MT14 WA04 WA14 WB15 WB27
    5D045 AA07 AA20



PATENT ABSTRACTS OF JAPAN

(11) Publication number: 2002-175091
(43) Date of publication of application: 21.06.2002

(51) Int. Cl.  G10L 13/00
               A63H 11/00
               B25J 5/00
               B25J 13/00
               G06F 3/16
               G10L 13/08

(21) Application number: 2000-372091
(22) Date of filing: 06.12.2000
(71) Applicant: SONY
(72) Inventor: SABE KOTARO
               OUDEYER PIERRE YVES

(54) SPEECH SYNTHESIS METHOD AND APPARATUS AND ROBOT APPARATUS

(57)Abstract: 

PROBLEM TO BE SOLVED: To produce an auditory emotional expression that more closely approximates the emotional expression of a living being or the like.
SOLUTION: The robot apparatus generates utterance sentences by speech synthesis through four steps: an emotion discriminating step (step S1) of discriminating the emotion state of an emotion model; an utterance sentence outputting step (step S2) of outputting an utterance sentence representing the content to be uttered as speech; a parameter control step (step S3) of controlling the parameters for speech synthesis according to the emotion state discriminated in the emotion discriminating step; and a speech synthesizing step (step S4) of inputting the utterance sentence output in the utterance sentence outputting step to the speech synthesizing section and performing speech synthesis according to the controlled parameters.
* NOTICES *



JPO and INPIT are not responsible for any 
damages caused by the use of this translation. 

1. This document has been translated by computer. So the translation may not 
reflect the original precisely. 

2. **** shows the word which can not be translated. 
3. In the drawings, any words are not translated. 

CLAIMS 
[Claim(s)] 

[Claim 1] A speech synthesis method for synthesizing speech based on information from an uttering entity having at least an emotion model, the method comprising: an emotion discrimination step of discriminating the emotion state of the emotion model of the uttering entity; an utterance sentence output step of outputting an utterance sentence representing the content to be uttered as speech; a parameter control step of controlling parameters for speech synthesis according to the emotion state discriminated in the emotion discrimination step; and a speech synthesis step of inputting the utterance sentence output in the utterance sentence output step to a speech synthesis section and synthesizing speech based on the controlled parameters.

[Claim 2] The speech synthesis method according to claim 1, wherein the utterance sentence is a sentence with meaningless content.

[Claim 3] The speech synthesis method according to claim 1, wherein the utterance sentence output step outputs the utterance sentence and supplies it to the speech synthesis section when the emotion state of the emotion model exceeds a predetermined threshold.

[Claim 4] The speech synthesis method according to claim 1, wherein the utterance sentence output step outputs an utterance sentence obtained at random for each utterance and supplies it to the speech synthesis section.

[Claim 5] The speech synthesis method according to claim 1, wherein the utterance sentence consists of a plurality of phonemes, and the parameters include the duration, pitch and volume of the phonemes.

[Claim 6] The speech synthesis method according to claim 1, wherein the uttering entity is an autonomous robot apparatus that acts based on supplied input information and has, as the emotion model, an emotion model that gives rise to the action, the method further comprising an emotion model changing step of determining the action by changing the state of the emotion model based on the input information.
[Claim 7] A speech synthesis apparatus for synthesizing speech based on
information from an utterance subject having at least an emotion model, the
apparatus comprising: emotion discrimination means for discriminating the
emotional state of the emotion model of the utterance subject; utterance
sentence output means for outputting an utterance sentence representing the
content to be uttered as speech; parameter control means for controlling
parameters for speech synthesis according to the emotional state discriminated
by the emotion discrimination means; and speech synthesis means that is
supplied with the utterance sentence output by the utterance sentence output
means and synthesizes speech based on the controlled parameters.

[Claim 8] The speech synthesis apparatus according to claim 7, wherein the
utterance sentence is a sentence with meaningless content.

[Claim 9] The speech synthesis apparatus according to claim 7, wherein, when
the emotional state of the emotion model exceeds a predetermined threshold, the
utterance sentence output means outputs the utterance sentence and supplies it
to the speech synthesis means.

[Claim 10] The speech synthesis apparatus according to claim 7, wherein the
utterance sentence output means outputs an utterance sentence obtained at
random for each utterance and supplies it to the speech synthesis means.

[Claim 11] The speech synthesis apparatus according to claim 7, wherein the
utterance sentence consists of a plurality of phonemes, and the parameters
include the duration, pitch, and volume of the phonemes.

[Claim 12] The speech synthesis apparatus according to claim 7, wherein the
utterance subject is an autonomous robot apparatus that acts according to
supplied input and has, as the emotion model, an emotion model underlying its
actions, the apparatus further comprising emotion model change means for
determining the action by changing the state of the emotion model based on the
input.
[Claim 13] An autonomous robot apparatus that acts based on supplied input,
comprising: an emotion model underlying the action; emotion discrimination
means for discriminating the emotional state of the emotion model; utterance
sentence output means for outputting an utterance sentence representing the
content to be uttered as speech; parameter control means for controlling
parameters for speech synthesis according to the emotional state discriminated
by the emotion discrimination means; and speech synthesis means that is
supplied with the utterance sentence output by the utterance sentence output
means and synthesizes speech based on the controlled parameters.

[Claim 14] The robot apparatus according to claim 13, wherein the utterance
sentence is a sentence with meaningless content.

[Claim 15] The robot apparatus according to claim 13, wherein, when the
emotional state of the emotion model exceeds a predetermined threshold, the
utterance sentence output means outputs the utterance sentence and supplies it
to the speech synthesis means.

[Claim 16] The robot apparatus according to claim 13, wherein the utterance
sentence output means outputs an utterance sentence obtained at random for each
utterance and supplies it to the speech synthesis means.

[Claim 17] The robot apparatus according to claim 13, wherein the utterance
sentence consists of a plurality of phonemes, and the parameters include the
duration, pitch, and volume of the phonemes.

[Claim 18] The robot apparatus according to claim 13, further comprising
emotion model change means for determining the action by changing the state of
the emotion model based on the input.



DETAILED DESCRIPTION 

[Detailed Description of the Invention] 
[0001] 

[Field of the Invention] This invention relates to a speech synthesis method
and a speech synthesis apparatus for generating the speech output by an
utterance subject, and to a robot apparatus that outputs speech.



[0002] 

[Description of the Prior Art] In recent years, pet-type robot apparatuses
whose external form imitates animals such as dogs and cats have become
available. Some of these robot apparatuses act autonomously according to
information from the outside or their own internal state.

[0003] Attempts are being made for the artificial intelligence (AI) used in
such robot apparatuses to realize artificially not only intellectual functions
such as inference and judgment but also functions such as emotion and instinct.
Among the visual and auditory means by which such artificial intelligence can
express itself to the outside, the use of speech is one example of an auditory
means.

[0004] In such a robot apparatus, a function for conveying its own emotions to
humans (its owner, etc.) through utterances is effective. Although humans
cannot directly understand what real pets such as dogs and cats are saying,
they can tell from experience what mood their pet dog or cat is in, and the
pet's utterances are one of the elements on which that judgment is based.
[0005] 



[Problem(s) to be Solved by the Invention] Among the robot apparatuses
currently on the market, some are known to express emotion audibly using
electronic sounds. Specifically, a short, high-pitched sound expresses joy, a
slow, low-pitched sound expresses sadness, and so on. These electronic sounds
are composed in advance, assigned to emotion classes according to human
subjective judgment, and played back. Here, an emotion class is a category of
emotion such as joy or anger. Conventional auditory emotional expression using
electronic sounds differs greatly from the emotional expression of living pets
such as dogs and cats in that (i) it is mechanical, (ii) it always repeats the
same expression, and (iii) it is unknown whether its expressive power is
adequate, and further improvement is desired.

[0006] The present invention has been made in view of these circumstances, and
aims to provide a speech synthesis method, a speech synthesis apparatus, and a
robot apparatus that enable auditory emotional expression close to that of
living creatures.
[0007] 

[Means for Solving the Problem] To solve the above problem, the speech
synthesis method according to the present invention comprises: an emotion
discrimination step of discriminating the emotional state of the emotion model
of an utterance subject; an utterance sentence output step of outputting an
utterance sentence representing the content to be uttered as speech; a
parameter control step of controlling parameters for speech synthesis according
to the emotional state discriminated in the emotion discrimination step; and a
speech synthesis step of inputting the utterance sentence output in the
utterance sentence output step into a speech synthesis section and synthesizing
speech based on the controlled parameters.

[0008] Such a speech synthesis method generates the utterance of the utterance
subject based on speech synthesis parameters controlled according to the
emotional state of the utterance subject's emotion model.
[0009] To solve the above problem, the speech synthesis apparatus according to
the present invention comprises: emotion discrimination means for
discriminating the emotional state of the emotion model of an utterance
subject; utterance sentence output means for outputting an utterance sentence
representing the content to be uttered as speech; parameter control means for
controlling parameters for speech synthesis according to the emotional state
discriminated by the emotion discrimination means; and speech synthesis means
that is supplied with the utterance sentence output by the utterance sentence
output means and synthesizes speech based on the controlled parameters.

[0010] In a speech synthesis apparatus with this configuration, the parameter
control means controls the parameters for speech synthesis according to the
emotional state discriminated by the emotion discrimination means, and the
speech synthesis means, supplied with the utterance sentence output by the
utterance sentence output means, synthesizes speech based on the controlled
parameters. The speech synthesis apparatus thereby generates the utterance of
the utterance subject based on speech synthesis parameters controlled according
to the emotional state of the utterance subject's emotion model.

[0011] To solve the above problem, the robot apparatus according to the present
invention comprises: an emotion model underlying its actions; emotion
discrimination means for discriminating the emotional state of the emotion
model; utterance sentence output means for outputting an utterance sentence
representing the content to be uttered as speech; parameter control means for
controlling parameters for speech synthesis according to the emotional state
discriminated by the emotion discrimination means; and speech synthesis means
that is supplied with the utterance sentence output by the utterance sentence
output means and synthesizes speech based on the controlled parameters.

[0012] In a robot apparatus with this configuration, the parameter control
means controls the parameters for speech synthesis according to the emotional
state discriminated by the emotion discrimination means, and the speech
synthesis means, supplied with the utterance sentence output by the utterance
sentence output means, synthesizes speech based on the controlled parameters.
The robot apparatus thereby generates its utterance based on speech synthesis
parameters controlled according to the emotional state of its emotion model.
[0013] 

[Embodiment of the Invention] Before the preferred embodiments of the speech
synthesis method and apparatus and the robot apparatus according to the present
invention are described, this section first explains the significance of
giving a pet-type robot apparatus or the like the function of expressing
emotion by speech, and what kind of emotional expression by speech is suitable.

[0014] (1) Emotional expression by speech
Adding emotional expression to utterances as a function of a robot apparatus or
the like is very effective in increasing its intimacy with humans. Beyond
improving sociability, simply by showing its own satisfaction or
dissatisfaction the robot can also request stimulation from humans. Such a
function works effectively in a robot apparatus that has a learning function.
[0015] On the other hand, many researchers have reported on whether human
emotions correlate with the acoustic features of uttered speech, for example
Fairbanks (Fairbanks G. (1940), Recent experimental investigations of vocal
pitch in speech, Journal of the Acoustical Society of America, (11):457-466)
and Burkhardt et al. (Burkhardt F. and Sendlmeier W.F., Verification of
Acoustic Correlates of Emotional Speech using Formant Synthesis, ISCA Workshop
on Speech and Emotion, Belfast 2000).

[0016] According to these reports, utterances correlate with psychological
states and with several basic emotion classes. Conversely, it is reported that
differences are hard to find between certain emotions, such as surprise and
fear, or boredom and sadness. A given emotion is tied to a given physical
state, which has an easily predictable physical effect on the utterance.

[0017] For example, when a person feels anger, fear, or joy, the sympathetic
nervous system is aroused, and heart rate and blood pressure rise. The mouth
becomes dry and the muscles sometimes tremble. At such times, utterances become
louder and faster, with strong energy in the high-frequency components. When a
person feels boredom or sadness, the parasympathetic nervous system is aroused.
Heart rate and blood pressure fall and more saliva is secreted. As a result,
utterances become slow and low-pitched. Because these physical characteristics
are universal, the correlation between basic emotions and the acoustic
characteristics of utterances is thought to exist regardless of race or
culture.

[0018] Experiments in which Japanese and American speakers uttered strings of
meaningless words with various emotions, and the extent to which each group
recognized the other's emotions was measured, are reported by Abelin et al.
(Abelin A., Allwood J., Cross Linguistic Interpretation of Emotional Prosody,
ISCA Workshop on Speech and Emotion, Belfast 2000) and by Tickle (Tickle A.,
English and Japanese Speaker's Emotion Vocalizations and Recognition: A
Comparison Highlighting Vowel Quality, ISCA Workshop on Speech and Emotion,
Belfast 2000). These reports make two points clear: (i) the emotion recognition
rate does not change with the language, and (ii) the recognition rate is not
very high, at about 60%.

[0019] Taking these research results into account, conveying emotion with
meaningless words, which requires no transfer of meaning, for example between a
human and a robot apparatus, turns out to be possible, although highly
ambiguous: the emotion recognition rate is about 60%. The results also show
that such utterances can be synthesized by modeling the correlation between
emotion and acoustic features.

[0020] In the embodiment of the present invention, emotion is expressed by
producing utterances based on such acoustic features. Furthermore, the
embodiment realizes utterances that are (i) speech-like, (ii) without meaning,
and (iii) different each time.

[0021] FIG. 1 is a flowchart showing the basic configuration of an embodiment
of the speech synthesis method according to the present invention. The
utterance subject is assumed to be, for example, a robot apparatus having at
least an emotion model, speech synthesis means, and utterance means. The
invention is not limited to this, however, and can of course be applied to
various robot apparatuses and to artificial intelligence (AI) on various
computers other than robots. The emotion model is described in detail later.

[0022] In FIG. 1, the first step S1 discriminates the emotional state of the
utterance subject's emotion model. Specifically, the state of the emotion model
(the emotional state) changes according to the surrounding environment
(external factors) or the internal state (internal factors), and this step
discriminates whether the emotional state is calm, anger, sadness, happiness,
comfort, or ********.

[0023] In the case of a robot apparatus, the behavior model internally has a
probabilistic state transition model (for example, a model having a state
transition table, as described later). Each state has a transition probability
table that differs according to the recognition results and the values of
emotion and instinct. The model transitions to the next state according to
those probabilities and outputs the action associated with the transition.

[0024] Behaviors that express joy or sadness according to emotion are described
in this probabilistic state transition model (or probability transition table),
and emotional expression by speech (utterance) is included as one of these
expressive behaviors. In this example, therefore, emotional expression is one
element of the action that the behavior model determines by referring to the
parameters representing the emotional state of the emotion model, and the
emotional state is discriminated as one function of the action determination
section.

[0025] The present invention is not limited to this example. It suffices that
step S1 at least discriminates the emotional state of the emotion model. The
subsequent steps perform speech synthesis that expresses the discriminated
emotional state by speech.

[0026] The next step S2 outputs an utterance sentence representing the content
to be uttered as speech. Step S2 may come before step S1 or after step S3
described later. A new utterance sentence may be generated each time, or one of
a plurality of utterance sentences generated and prepared in advance may be
selected at random. In the embodiment of the present invention, however, the
utterance sentence must be meaningless. The reasons are as follows.
Particularly in a robot apparatus, a dialogue that actually carries meaning is
difficult to realize, whereas the utterance of meaningless words can be
realized with a simple configuration. When accompanied by emotional expression,
even meaningless words can give a sufficient impression of dialogue. Meaningless
words also leave the listener more room for imagination, so that closeness and
intimacy increase more than with utterances whose meaning is inappropriate.
Moreover, because the utterance sentence is generated or selected at random,
the speech that is synthesized and reproduced differs each time and always
feels fresh.



[0027] The utterance sentence output in step S2 is thus a sentence made up of
random words, and each word is made up of random syllables. Here a syllable is
a combination of phonemes, a consonant C and a vowel V, of the form CV or CCV.
The phonemes are prepared in advance, and each initially has a fixed duration
and pitch as parameters. In the embodiment, these parameters are controlled
according to the result of discriminating the emotional state, which produces
an utterance with emotional expression. The control of the parameters according
to the discrimination result is described in detail later.

[0028] In this embodiment the utterance sentence that is output is unrelated to
the emotional state of the emotion model or its discrimination result. However,
the utterance sentence may be adjusted to some extent according to the
emotional state, or the generation or selection of the utterance sentence may
itself be controlled according to it.

[0029] Next, step S3 controls the parameters for speech synthesis according to
the result of discriminating the emotional state in step S1. The parameters for
speech synthesis are, for example, the duration, pitch, and volume of the
phonemes described above, and emotion is expressed by changing these parameters
according to the discriminated emotional state, for example calm, anger,
sadness, happiness, or comfort. Specifically, a table of parameter combinations
is created in advance for each emotion (calm, anger, sadness, happiness,
comfort, etc.), and the table is switched according to the emotion actually
discriminated. Examples of the tables prepared for each emotion are shown
later.

[0030] In the next step S4, the utterance sentence output in step S2 is sent to
a speech synthesizer, and speech is synthesized according to the parameters
controlled in step S3. The resulting time series of synthesized speech data is
sent through a D/A converter, an amplifier, and so on to a loudspeaker and
emitted as actual speech. In a robot apparatus, for example, this processing is
performed in the so-called virtual robot, and an utterance expressing the
emotion at that time is emitted from the loudspeaker.
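
The order of steps S1 to S4 can be sketched in Python as follows. This is only
an illustrative sketch under stated assumptions, not the implementation of the
invention: the emotion model is assumed to be a dictionary of emotion
intensities, the table values are placeholders (the actual tables are not
reproduced in this text), and synthesize() is a hypothetical stand-in for a
real speech synthesizer.

```python
import random

# Placeholder parameter tables, one per emotion. The numbers are hypothetical
# and are NOT taken from the tables of this document.
PARAMETER_TABLES = {
    "calm": {"MEANDUR": 200, "MEANPITCH": 280, "VOLUME": 50},
    "anger": {"MEANDUR": 150, "MEANPITCH": 400, "VOLUME": 100},
}


def discriminate_emotion(emotion_model):
    """S1: return the emotion with the highest intensity (assumed model)."""
    return max(emotion_model, key=emotion_model.get)


def output_utterance_sentence(prepared_sentences, rng):
    """S2: select one prepared meaningless sentence at random."""
    return rng.choice(prepared_sentences)


def control_parameters(emotion):
    """S3: switch to the parameter table for the discriminated emotion."""
    return PARAMETER_TABLES[emotion]


def synthesize(sentence, params):
    """S4: stub that returns the request a synthesizer would receive."""
    return {"sentence": sentence, "params": params}


def speak(emotion_model, prepared_sentences, rng):
    """Run S1 to S4 in the order shown in FIG. 1."""
    emotion = discriminate_emotion(emotion_model)
    sentence = output_utterance_sentence(prepared_sentences, rng)
    params = control_parameters(emotion)
    return synthesize(sentence, params)
```

Because S2 does not depend on the emotion in this sketch, it could equally be
called before S1 or after S3, as the text allows.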

[0031] According to the basic embodiment of the present invention described
above, controlling the parameters for speech synthesis (phoneme duration,
pitch, volume, etc.) according to emotions tied to physical states makes it
possible to produce utterances accompanied by emotional expression, and because
the phonemes are chosen at random, neither the words nor the sentence itself
need carry meaning. The output nevertheless sounds like speech, and by randomly
varying some of the parameters, or by randomly determining the combination of
phonemes and the lengths of words and sentences, a different utterance can be
produced each time it is synthesized. Since there are few parameters to
control, implementation is simple.

[0032] (2) Emotion and the synthesis algorithm for meaningless words
Emotion and the synthesis algorithm for the meaningless words used as utterance
sentences are explained in detail below. In the embodiment, the goals of the
synthesis algorithm are to generate speech-like utterance sentences without
meaning that differ each time, and to give such utterance sentences emotional
expression.

[0033] A speech synthesizer or a speech synthesis system is used to generate
such utterance sentences. The input to the speech synthesizer or speech
synthesis system is a list of phonemes together with, for each phoneme, its
duration, its target pitches, and the times at which the target pitches are
reached (expressed as percentages of the duration). The outline of the
algorithm that realizes such speech synthesis is as follows.

[0034] (2-1) Generation of the utterance sentence
An utterance sentence of meaningless words is generated by composing it of
random words. Each word is in turn composed of random syllables. Here a
syllable is a combination of phonemes, a consonant C and a vowel V, expressed
as CV or CCV. The phonemes are held as a list, and every phoneme in the list is
initially registered with a fixed duration and pitch.

[0035] For example, a phoneme "b" is expressed by the values "448 10 150 80
158" and registered in the list. Here "448" indicates that the duration of the
phoneme "b" is 448 ms. The following "10" and "150" indicate that the pitch
reaches 150 Hz at 10% of the 448 ms duration, and the following "80" and "158"
indicate that it reaches 158 Hz at 80% of the 448 ms duration. All phonemes in
the list are expressed in this way.

[0036] FIG. 2 shows a syllable formed by joining the phoneme "b" given by "131
80 179", the phoneme "@" given by "77 20 200 80 229", and the phoneme "b" given
by "405 80 169". In this example, phonemes that are actually discontinuous are
joined so as to form a continuous syllable.
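
The phoneme encoding described above (a duration in milliseconds followed by
pairs of a percentage of the duration and a target pitch in Hz) can be decoded
with a short Python sketch. The class and function names are hypothetical and
chosen only for illustration; the numeric strings are the ones quoted in the
text.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Phoneme:
    symbol: str
    duration_ms: int
    # Pairs of (percent of the duration, target pitch in Hz).
    pitch_targets: List[Tuple[int, int]]

    def absolute_targets(self) -> List[Tuple[float, int]]:
        """Return (time in ms from the phoneme start, pitch in Hz) pairs."""
        return [(self.duration_ms * pct / 100.0, hz)
                for pct, hz in self.pitch_targets]


def parse_phoneme(symbol: str, spec: str) -> Phoneme:
    """Parse a string such as "448 10 150 80 158" into a Phoneme."""
    fields = [int(f) for f in spec.split()]
    duration, rest = fields[0], fields[1:]
    if len(rest) % 2 != 0:
        raise ValueError("pitch targets must come in (percent, Hz) pairs")
    return Phoneme(symbol, duration, list(zip(rest[0::2], rest[1::2])))


# The phoneme "b" from the text: 150 Hz at 10% and 158 Hz at 80% of 448 ms.
b = parse_phoneme("b", "448 10 150 80 158")

# The three phonemes joined into the syllable of FIG. 2.
syllable = [
    parse_phoneme("b", "131 80 179"),
    parse_phoneme("@", "77 20 200 80 229"),
    parse_phoneme("b", "405 80 169"),
]
syllable_duration_ms = sum(p.duration_ms for p in syllable)
```

Converting the percentages to absolute times, as absolute_targets() does, is
one convenient way to lay the pitch targets of several phonemes on a common
time axis when they are joined into a syllable.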

[0037] Emotional expression is given to the utterance sentence because the
phonemes making up such syllables can be modified to suit each emotional
expression. Specifically, the duration and pitch described above, which
characterize each phoneme, are changed to express emotion.

[0038] An utterance sentence made up of such phonemes is, broadly, a
combination of words; each word is a combination of syllables, and each
syllable is a combination of phonemes. The processing at each stage of
composing such an utterance sentence, <1> to <5>, is explained in detail below.

<1> First, the number of words in the sentence is determined. For example, it
is set to a random number between 20 and MAXWORDS. MAXWORDS is the maximum
number of words making up a sentence and is a parameter for speech synthesis.

<2> Each word is generated. Specifically, it is first decided with probability
PROBACCENT whether the word carries an accent.

[0039] The syllables of the word and their phonemes are then determined by the
following procedure <3-1> to <3-7>, which determines the word.

<3-1> The number of syllables in each word is determined. For example, it is
set to a random number between 2 and MAXSYLL. MAXSYLL is the maximum number of
syllables making up a word and is a parameter for speech synthesis.

<3-2> If the word carries an accent, one of its syllables is chosen at random
and marked as accented.

<3-3> Each syllable is determined to be of the form CV or CCV. For example, the
CV form is chosen with a probability of 0.8.

<3-4> The consonants and vowels assigned to C and V of the chosen CV or CCV
form are read at random from the phoneme database (or phoneme list).

<3-5> The duration of each phoneme is calculated as MEANDUR + random(DURVAR).
Here MEANDUR is a fixed duration and random(DURVAR) is a value determined as a
random number. MEANDUR and DURVAR are parameters for speech synthesis.

<3-6-1> The pitch of a phoneme is calculated as e = MEANPITCH +
random(PITCHVAR). Here MEANPITCH is a fixed pitch and random(PITCHVAR) is a
value determined as a random number. MEANPITCH and PITCHVAR are parameters
determined according to emotion.

<3-6-2> If the phoneme is a consonant, its pitch is e - PITCHVAR. If the
phoneme is a vowel, its pitch is e + PITCHVAR.

<3-7-1> If the syllable is accented, DURVAR is added to the duration.

<3-7-2> If the syllable is accented and DEFAULTCONTOUR=rising, the consonant
pitch is set to MAXPITCH - PITCHVAR and the vowel pitch to MAXPITCH + PITCHVAR.
If DEFAULTCONTOUR=falling, the consonant pitch is set to MAXPITCH + PITCHVAR
and the vowel pitch to MAXPITCH - PITCHVAR. If DEFAULTCONTOUR=stable, the
consonant and vowel pitches are both set to MAXPITCH. DEFAULTCONTOUR indicates
the contour of the syllable, and MAXPITCH is a parameter for speech synthesis.

[0040] The syllables of a word and their phonemes are determined by the
procedure <3-1> to <3-7> above. Finally, processing is performed to change the
contour of the last word of the utterance sentence.

<4-1> If the last word of the sentence has no accent, e is set to e =
PITCHVAR/2. Then, when CONTOURLASTWORD=falling, -(I+1)*e is added for each
syllable and e is updated as e = e + e, where I is the index of the phoneme.
When CONTOURLASTWORD=rising, +(I+1)*e is added for each syllable and e is
updated as e = e + e.

<4-2> If, on the other hand, the last word has an accent, then when
CONTOURLASTWORD=falling, DURVAR is added to the duration of each syllable, the
consonant pitch is set to MAXPITCH + PITCHVAR, and the vowel pitch is set to
MAXPITCH - PITCHVAR. When CONTOURLASTWORD=rising, DURVAR is added to the
duration of each syllable, the consonant pitch is set to MAXPITCH - PITCHVAR,
and the vowel pitch is set to MAXPITCH + PITCHVAR.

<5> Finally, the volume of the whole sentence is set to VOLUME. VOLUME is a
parameter for speech synthesis.

[0041] An utterance sentence is generated by the processing at each of the
above stages. Because some of the parameters used to determine the utterance
sentence are given as random numbers, meaningless words that differ each time
can be generated. Emotional expression is given to such an utterance sentence
by setting the various parameters described above according to the emotion.
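
Stages <1> to <3-7> and <5> above can be sketched in Python as follows. This is
a minimal sketch under stated assumptions and is not the program of the
drawings: random(X) is assumed to mean a uniform integer from 0 to X, the
phoneme inventory and all default parameter values are hypothetical, MINWORDS
and PROBCV are names introduced here for the lower bound of stage <1> and the
CV probability of stage <3-3>, and the last-word contour processing of stage
<4> is omitted for brevity.

```python
import random
from dataclasses import dataclass

# Hypothetical phoneme inventory; the actual phoneme database is not shown.
CONSONANTS = ["b", "d", "g", "k", "m", "n", "p", "t"]
VOWELS = ["a", "e", "i", "o", "u"]


@dataclass
class Params:
    # Parameter names follow the text; every default value is a placeholder.
    MINWORDS: int = 20         # lower bound of stage <1> (name assumed)
    MAXWORDS: int = 24
    MAXSYLL: int = 4
    PROBACCENT: float = 0.4
    PROBCV: float = 0.8        # chance of CV rather than CCV (name assumed)
    MEANDUR: int = 200         # ms
    DURVAR: int = 20           # ms
    MEANPITCH: int = 280       # Hz
    PITCHVAR: int = 10         # Hz
    MAXPITCH: int = 370        # Hz
    DEFAULTCONTOUR: str = "rising"
    VOLUME: int = 50


def make_syllable(p, rng, accented):
    shape = "CV" if rng.random() < p.PROBCV else "CCV"             # <3-3>
    phonemes = []
    for kind in shape:
        symbol = rng.choice(CONSONANTS if kind == "C" else VOWELS)  # <3-4>
        dur = p.MEANDUR + rng.randint(0, p.DURVAR)                  # <3-5>
        e = p.MEANPITCH + rng.randint(0, p.PITCHVAR)                # <3-6-1>
        sign = -1 if kind == "C" else 1
        pitch = e + sign * p.PITCHVAR                               # <3-6-2>
        if accented:
            dur += p.DURVAR                                         # <3-7-1>
            if p.DEFAULTCONTOUR == "rising":                        # <3-7-2>
                pitch = p.MAXPITCH + sign * p.PITCHVAR
            elif p.DEFAULTCONTOUR == "falling":
                pitch = p.MAXPITCH - sign * p.PITCHVAR
            else:  # "stable"
                pitch = p.MAXPITCH
        phonemes.append({"symbol": symbol, "dur_ms": dur, "pitch_hz": pitch})
    return phonemes


def make_word(p, rng):
    has_accent = rng.random() < p.PROBACCENT                        # <2>
    n_syll = rng.randint(2, p.MAXSYLL)                              # <3-1>
    accent_at = rng.randrange(n_syll) if has_accent else -1         # <3-2>
    return [make_syllable(p, rng, i == accent_at) for i in range(n_syll)]


def make_sentence(p, rng):
    n_words = rng.randint(p.MINWORDS, p.MAXWORDS)                   # <1>
    words = [make_word(p, rng) for _ in range(n_words)]
    return {"volume": p.VOLUME, "words": words}                     # <5>
```

Passing a seeded random.Random instance makes a given sentence reproducible for
testing, while an unseeded generator yields a different sentence on every call,
which is the behavior the text aims for.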
[0042] The program (source code) for realizing the above processing in hardware
is shown in FIG. 3 and FIG. 4. FIG. 3 shows the first part of the program and
FIG. 4 shows the latter part.

[0043] (2-2) Parameters according to feeling. A feeling expression can be given to an utterance sentence by controlling, according to each feeling, the parameters used in the sentence-generation algorithm described above. The feelings expressed by an utterance sentence include, for example, calm, anger, sadness, happiness, and comfort. Needless to say, the feelings are not limited to those listed here.
[0044] Such feelings can be represented, for example, in a feature space whose axes are arousal and valence. As shown in drawing 5, the regions of anger, sadness, happiness, and comfort are formed in the feature space spanned by arousal and valence, and the region of calm is formed at its center. For example, "anger" is expressed as aroused and negative, and "sadness" is expressed as not aroused and negative.

[0045] The following tables show the combinations of parameters (at least the duration (DUR), pitch (PITCH), and volume (VOLUME) of phonemes) determined beforehand for each feeling, such as anger, sadness, happiness, and comfort. Each table is generated beforehand based on the characteristics of the corresponding feeling.



[0046]
[Table 1]

[0047]
[Table 2]

[0048]
[Table 3]

[0049]
[Table 4]

[0050]
[Table 5]



[0051] Thus, the parameters used for speech synthesis are controlled according to feeling by switching, according to the feeling actually determined, among the tables of parameters prepared beforehand for each feeling.
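The table switching described in [0051] amounts to a lookup keyed by the determined feeling. The sketch below is a minimal illustration: the numeric values are placeholders invented for the example (the contents of Tables 1 to 5 are not reproduced in this text), and the names PARAMETER_TABLES and select_parameters are hypothetical.

```python
# Placeholder values only; they do NOT come from Tables 1 to 5.
PARAMETER_TABLES = {
    "calm":      {"DUR": 1.0, "PITCH": 1.0, "VOLUME": 1.0},
    "anger":     {"DUR": 0.8, "PITCH": 1.2, "VOLUME": 1.5},
    "sadness":   {"DUR": 1.3, "PITCH": 0.8, "VOLUME": 0.7},
    "happiness": {"DUR": 0.9, "PITCH": 1.3, "VOLUME": 1.2},
    "comfort":   {"DUR": 1.1, "PITCH": 0.9, "VOLUME": 0.9},
}


def select_parameters(feeling, tables=PARAMETER_TABLES):
    """Return a copy of the parameter table prepared for the given feeling."""
    if feeling not in tables:
        raise ValueError(f"no parameter table prepared for feeling {feeling!r}")
    # Copy so that the synthesizer can modify values without touching the table.
    return dict(tables[feeling])
```

Returning a copy keeps the prepared tables unchanged when a later stage adds random variation to individual parameters.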

[0052] An utterance sentence carrying a feeling expression is thus generated by performing speech synthesis with the parameters of the table chosen according to feeling. When the robot equipment utters such a meaningless-word utterance sentence, a human does not understand the content of the utterance itself but can still recognize the feeling of the robot equipment. Moreover, because the utterance differs each time, the human always finds the dialogue with the robot equipment fresh. Next, the robot equipment that is an embodiment of this invention is described, followed by a concrete description of how the utterance algorithm above is implemented on that robot equipment.

[0053] In this embodiment, control of the parameters according to feeling is realized by switching, according to the actual feeling, among tables of parameters prepared beforehand for each feeling. Needless to say, control of the parameters according to feeling is not limited to this embodiment.
[0054] (3) Example of the robot equipment according to this embodiment. (3-1) Configuration of the robot equipment. As a more concrete embodiment of this invention, an example in which this invention is applied to an autonomous four-legged pet robot is described in detail below with reference to the drawings. A feeling and instinct model is introduced into the software of this pet-type robot equipment so that it can behave more like a living creature. Although this embodiment uses a robot that actually moves, utterance by meaningless words can easily be realized on any computer system with a loudspeaker and is an effective function in the field of interaction (or dialogue) between humans and machines. The applicability of this invention is therefore not restricted to robot systems.

[0055] As shown in drawing 6, the robot equipment in this example is a so-called pet robot shaped to imitate a "dog". The leg units 3A, 3B, 3C, and 3D are connected to the front and rear, left and right of the body unit 2, and the head unit 4 and the tail unit 5 are connected to the front end and the rear end of the body unit 2, respectively.

[0056] As shown in drawing 7, the body unit 2 houses the control section 16, formed by connecting a CPU (Central Processing Unit) 10, a DRAM (Dynamic Random Access Memory) 11, a flash ROM (Read Only Memory) 12, a PC (Personal Computer) card interface circuit 13, and a signal processing circuit 14 to one another through an internal bus 15, together with the battery 17 that serves as the power source of this robot equipment 1. The body unit 2 also houses an angular-velocity sensor 18, an acceleration sensor 19, and the like for detecting the orientation of robot equipment 1 and the acceleration of its motion.
[0057] The head unit 4 carries, each at a predetermined position, the CCD (Charge Coupled Device) camera 20 for imaging the external situation, the touch sensor 21 for detecting pressure from physical actions by the user such as "stroking" and "striking", the distance sensor 22 for measuring the distance to an object located ahead, the microphone 23 for collecting external sound, the loudspeaker 24 for outputting sound such as a cry, and LEDs (Light Emitting Diodes, not shown) corresponding to the "eyes" of robot equipment 1.

[0058] Furthermore, actuators 25_1 to 25_n and potentiometers 26_1 to 26_n, as many as the degrees of freedom, are arranged at the joints of each leg unit 3A-3D, at the connections between each leg unit 3A-3D and the body unit 2, at the connection between the head unit 4 and the body unit 2, and at the connection of the tail 5A of the tail unit 5. For example, each of the actuators 25_1 to 25_n includes a servo motor. The leg units 3A-3D are controlled by driving the servo motors and move to a target posture or perform a target motion.
[0059] The angular-velocity sensor 18, acceleration sensor 19, touch sensor 21, distance sensor 22, microphone 23, loudspeaker 24, and potentiometers 26_1 to 26_n, as well as the LEDs and the actuators 25_1 to 25_n, are connected to the signal processing circuit 14 of the control section 16 through the corresponding hubs 27_1 to 27_n, while the CCD camera 20 and the battery 17 are connected directly to the signal processing circuit 14.

[0060] The signal processing circuit 14 sequentially takes in the sensor data, image data, and voice data supplied from the sensors above and stores them one after another at predetermined locations in DRAM 11 through the internal bus 15. The signal processing circuit 14 likewise sequentially takes in the remaining-battery data, which indicates the remaining charge supplied from the battery 17, and stores it at a predetermined location in DRAM 11.

[0061] The sensor data, image data, voice data, and remaining-battery data stored in DRAM 11 in this way are subsequently used when CPU 10 controls the operation of this robot equipment 1.

[0062] In practice, in the initial state after the power of robot equipment 1 is switched on, CPU 10 reads the control program stored in the memory card 28 loaded in the PC card slot (not shown) of the body unit 2, or in flash ROM 12, either through the PC card interface circuit 13 or directly, and stores it in DRAM 11.

[0063] Thereafter, based on the sensor data, image data, voice data, and remaining-battery data that the signal processing circuit 14 stores sequentially in DRAM 11 as described above, CPU 10 judges the situation of the robot itself and its surroundings, and whether there are instructions or actions from the user.

[0064] Furthermore, CPU 10 determines the subsequent action based on this judgment result and the control program stored in DRAM 11, and drives the required actuators 25_1 to 25_n based on that decision, thereby making the robot act, for example by swinging the head unit 4 up and down or left and right, moving the tail 5A of the tail unit 5, or driving the leg units 3A-3D to walk.

[0065] In doing so, CPU 10 also generates voice data as needed and gives it to the loudspeaker 24 as a sound signal through the signal processing circuit 14, so that sound based on that signal is output externally, and turns the LEDs above on or off or makes them blink.

[0066] In this way, robot equipment 1 can act autonomously according to the situation of itself and its surroundings and to instructions and actions from the user.

[0067] (3-2) Software configuration of the control program. The software configuration of the control program in robot equipment 1 is shown in drawing 8. In drawing 8, the device driver layer 30 is located in the lowest layer of the control program and consists of a device driver set 31 made up of two or more device drivers. Each device driver is an object that is allowed to access directly hardware used in an ordinary computer, such as the CCD camera 20 (drawing 7) or a timer, and performs processing in response to an interrupt from the corresponding hardware.
[0068] The robotic server object 32 is located in the lowest layer of the device driver layer 30 and consists of the virtual robot 33, a software group that provides an interface for accessing hardware such as the sensors and actuators 25_1 to 25_n described above; the power manager 34, a software group that manages switching of the power source and the like; the device driver manager 35, a software group that manages the various other device drivers; and the designed robot 36, a software group that manages the mechanism of robot equipment 1.

[0069] The manager object 37 consists of an object manager 38 and a service manager 39. The object manager 38 is a software group that manages the startup and termination of the software groups contained in the robotic server object 32, the middleware layer 40, and the application layer 41. The service manager 39 is a software group that manages the connections between objects based on the inter-object connection information described in the connection file stored in the memory card 28 (drawing 7).

[0070] The middleware layer 40 is located in the layer above the robotic server object 32 and consists of software groups that provide the basic functions of robot equipment 1, such as image processing and speech processing. The application layer 41 is located in the layer above the middleware layer 40 and consists of software groups that determine the action of robot equipment 1 based on the results processed by the software groups of the middleware layer 40.
[0071] The concrete software configurations of the middleware layer 40 and the application layer 41 are shown in drawing 9.
[0072] As shown in drawing 9, the middleware layer 40 consists of the recognition system 60 and the output system 69. The recognition system 60 has the signal processing modules 50-58, for noise detection, temperature detection, brightness detection, scale recognition, distance detection, posture detection, the touch sensor, motion detection, and color recognition, together with the input semantics converter module 59. The output system 69 has the output semantics converter module 68 together with the signal processing modules 61-67, for posture management, tracking, motion playback, walking, recovery from a fall, LED lighting, and sound playback.
[0073] Each of the signal processing modules 50-58 of the recognition system 60 takes in the corresponding data from among the sensor data, image data, and voice data read from DRAM 11 (drawing 7) by the virtual robot 33 of the robotic server object 32, performs predetermined processing on it, and gives the result to the input semantics converter module 59. Here, the virtual robot 33 is configured as the part that exchanges or converts signals according to a predetermined protocol.

[0074] Based on the processing results given by these signal processing modules 50-58, the input semantics converter module 59 recognizes the situation of the robot itself and its surroundings, and commands and actions from the user, such as "noisy", "hot", "bright", "a ball was detected", "a fall was detected", "stroked", "struck", "the scale C-E-G was heard", "a moving object was detected", or "an obstacle was detected", and outputs the recognition result to the application layer 41 (drawing 7).

[0075] As shown in drawing 10, the application layer 41 consists of five modules: the behavioral model library 70, the action change module 71, the learning module 72, the feeling model 73, and the instinct model 74. Here, the feeling model 73 is a model whose feeling state changes in response to external stimuli and the like, and the feeling expression described above is superimposed on an utterance sentence according to the feeling determined by this feeling model. The states of the feeling model 73, the instinct model 74, and so on are monitored, that is, determined, by control means such as CPU 10.
[0076] As shown in drawing 11, the behavioral model library 70 contains independent behavioral models 70_1 to 70_n, each corresponding to one of several condition items chosen beforehand, such as "when the remaining battery charge is low", "when recovering from a fall", "when avoiding an obstacle", "when expressing feeling", and "when a ball is detected".

[0077] When a recognition result is given from the input semantics converter module 59, or when a fixed time has passed since the last recognition result was given, each of these behavioral models 70_1 to 70_n determines the subsequent action, referring as needed to the parameter value of the corresponding emotion held in the feeling model 73 and the parameter value of the corresponding desire held in the instinct model 74, as described later, and outputs the decision to the action change module 71.

[0078] In this embodiment, each of the behavioral models 70_1 to 70_n determines the next action using an algorithm called a finite probabilistic automaton. As shown in drawing 12, this algorithm determines probabilistically to which of the nodes NODE0 to NODEn a transition is made from one node (state) NODE0 to NODEn, based on the transition probabilities P1 to Pn set for the arcs ARC1 to ARCn that connect the nodes NODE0 to NODEn.

[0079] Concretely, each of the behavioral models 70_1 to 70_n has, for each of the nodes NODE0 to NODEn that form that behavioral model, a state transition table 80 as shown in drawing 13.

[0080] In this state transition table 80, the input events (recognition results) that serve as transition conditions at the node NODE0 to NODEn are listed in priority order in the "input event name" row, and further conditions on each transition condition are described in the corresponding columns of the "data name" and "data range" rows.

[0081] Accordingly, at the node NODE100 represented in the state transition table 80 of drawing 13, the conditions for a transition to another node are that, when the recognition result "a ball was detected (BALL)" is given, the "size (SIZE)" of the ball given with that recognition result is in the range "0 to 1000", or that, when the recognition result "an obstacle was detected (OBSTACLE)" is given, the "distance (DISTANCE)" to the obstacle given with that recognition result is in the range "0 to 100".

[0082] At this node NODE100, a transition to another node is also possible when no recognition result is input. Among the parameter values of the emotions and desires held in the feeling model 73 and the instinct model 74, to which the behavioral models 70_1 to 70_n refer periodically, the transition can occur when the parameter value of "joy (JOY)", "surprise (SURPRISE)", or "sadness (SUDNESS)" held in the feeling model 73 is in the range "50 to 100".



[0083] In the state transition table 80, the names of the nodes to which a transition can be made from the node NODE0 to NODEn are listed in the "transition destination node" row of the "transition probability to other nodes" column. The transition probability to each of the other nodes NODE0 to NODEn to which a transition can be made when all the conditions described in the "input event name", "data name", and "data range" rows are met is described in the corresponding place in the "transition probability to other nodes" column, and the action to be output on transition to that node NODE0 to NODEn is described in the "output action" row of the same column. The probabilities in each row of the "transition probability to other nodes" column sum to 100 [%].

[0084] Accordingly, at the node NODE100 represented in the state transition table 80 of drawing 13, when, for example, the recognition result "a ball was detected (BALL)" is given and the "SIZE" of the ball is in the range "0 to 1000", a transition to "node NODE120 (node 120)" can be made with a probability of "30 [%]", and the action "ACTION 1" is output at that time.
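The lookup at one node of such a finite probabilistic automaton can be sketched as below. Only the BALL row with SIZE 0 to 1000 leading to NODE120 at 30 [%] with ACTION 1, and the OBSTACLE condition with DISTANCE 0 to 100, come from the text; the remaining destinations and probabilities are hypothetical fillers so that each row sums to 100 [%], and the names NODE100_TABLE and step are assumptions.

```python
import random

# Rows are listed in priority order. Each target is (node, probability %, action).
NODE100_TABLE = [
    {"event": "BALL", "data": "SIZE", "range": (0, 1000),
     "targets": [("NODE120", 30, "ACTION 1"),
                 ("NODE100", 70, None)]},        # hypothetical remainder
    {"event": "OBSTACLE", "data": "DISTANCE", "range": (0, 100),
     "targets": [("NODE100", 100, None)]},       # hypothetical destination
]


def step(table, event, value, rng=random):
    """Return (next_node, action) for a recognition result, or None if no row matches."""
    for row in table:
        low, high = row["range"]
        if row["event"] == event and low <= value <= high:
            targets = row["targets"]
            weights = [probability for _, probability, _ in targets]
            if sum(weights) != 100:
                raise ValueError("transition probabilities must sum to 100 [%]")
            # Draw one destination according to the transition probabilities.
            node, _, action = rng.choices(targets, weights=weights, k=1)[0]
            return node, action
    return None
```

Passing a seeded random.Random instance as rng makes the draws reproducible, which is convenient when checking that the observed transition frequencies match the table.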
[0085] Each of the behavioral models 70_1 to 70_n is constructed by connecting many nodes NODE0 to NODEn, each described by such a state transition table 80. When a recognition result is given from the input semantics converter module 59, for example, the behavioral model determines the next action probabilistically using the state transition table of the corresponding node NODE0 to NODEn and outputs the decision to the action change module 71.

[0086] The action change module 71 shown in drawing 10 selects, from the actions output by the behavioral models 70_1 to 70_n of the behavioral model library 70, the action output by the behavioral model with the highest predetermined priority, and sends a command to execute that action (hereafter called an action command) to the output semantics converter module 68 of the middleware layer 40. In this embodiment, a behavioral model drawn lower in drawing 11 is given a higher priority.
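The selection rule in [0086] can be illustrated with a few lines of code. The model names and actions below are hypothetical, and the function select_action is only a sketch of the priority rule, not of the full action change module 71.

```python
def select_action(proposals, priority_order):
    """Pick the action proposed by the highest-priority behavioral model.

    proposals: dict mapping a model name to its proposed action (None if it proposed nothing).
    priority_order: model names listed from highest to lowest priority.
    """
    for name in priority_order:
        action = proposals.get(name)
        if action is not None:
            return name, action
    return None  # no behavioral model proposed an action
```

For example, if a hypothetical "fall_recovery" model and a hypothetical "explore" model both propose actions and "fall_recovery" is listed first, its action is the one turned into an action command.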
[0087] Based on the action-completion information given from the output semantics converter module 68 after an action is completed, the action change module 71 also notifies the learning module 72, the feeling model 73, and the instinct model 74 that the action has been completed.
[0088] Meanwhile, of the recognition results given from the input semantics converter module 59, the learning module 72 receives the recognition results of teaching received as actions from the user, such as "struck" and "stroked".

[0089] Based on this recognition result and the notification from the action change module 71, the learning module 72 changes the corresponding transition probability of the corresponding behavioral model 70_1 to 70_n in the behavioral model library 70 so as to lower the probability that the action occurs when the robot is "struck" (scolded) and to raise it when the robot is "stroked" (praised).

[0090] Meanwhile, the feeling model 73 holds, for each of a total of six emotions, "joy", "sadness", "anger", "surprise", "disgust", and "fear", a parameter expressing the strength of that emotion. The feeling model 73 periodically updates the parameter values of these emotions based on specific recognition results given from the input semantics converter module 59, such as "struck" and "stroked", the elapsed time, the notification from the action change module 71, and so on.

[0091] Specifically, let ΔE[t] be the amount of change in an emotion at that time, computed by a predetermined arithmetic expression based on the recognition result given from the input semantics converter module 59, the action of robot equipment 1 at that time, the time elapsed since the last update, and so on; let E[t] be the current parameter value of the emotion; and let ke be a coefficient expressing the sensitivity of the emotion. The feeling model 73 computes the parameter value E[t+1] of the emotion in the next period by formula (1) and updates the parameter value of the emotion by replacing the current parameter value E[t] with it. The feeling model 73 updates the parameter values of all the emotions in the same way.
[0092] 
[Equation 1] 

[0093] How much each recognition result and the notification from the output semantics converter module 68 affect the amount of change ΔE[t] in the parameter value of each emotion is determined beforehand. For example, the recognition result "struck" has a large effect on the amount of change ΔE[t] in the parameter value of "anger", and the recognition result "stroked" has a large effect on the amount of change ΔE[t] in the parameter value of "joy".
[0094] Here, the notification from the output semantics converter module 68 is so-called action feedback information (action-completion information), that is, information on the result of an action's occurrence, and the feeling model 73 also changes feeling based on such information. For example, the action of "barking" lowers the feeling level of anger. The notification from the output semantics converter module 68 is also input to the learning module 72 described above, and the learning module 72 changes the corresponding transition probabilities of the behavioral models 70_1 to 70_n based on it.

[0095] Feedback of an action result may also be made by the output of the action change module 71 (an action to which feeling has been added).
[0096] Meanwhile, the instinct model 74 holds, for each of four mutually independent desires, "exercise", "affection", "appetite", and "curiosity", a parameter expressing the strength of that desire. The instinct model 74 periodically updates the parameter values of these desires based on the recognition results given from the input semantics converter module 59, the elapsed time, the notification from the action change module 71, and so on.

[0097] Specifically, for "exercise", "affection", and "curiosity", let ΔI[k] be the amount of change in the desire at that time, computed by a predetermined arithmetic expression based on the recognition result, the elapsed time, the notification from the output semantics converter module 68, and so on; let I[k] be the current parameter value of the desire; and let ki be a coefficient expressing the sensitivity of the desire. The instinct model 74 computes the parameter value I[k+1] of the desire in the next period by formula (2) at a predetermined period and updates the parameter value of the desire by replacing the current parameter value I[k] with the result. The instinct model 74 updates the parameter value of each desire except "appetite" in the same way.
[0098] 
[Equation 2] 

[0099] How much each recognition result and the notification from the output semantics converter module 68 affect the amount of change ΔI[k] in the parameter value of each desire is determined beforehand. For example, the notification from the output semantics converter module 68 has a large effect on the amount of change ΔI[k] in the parameter value of "fatigue".
[0100] In this embodiment, the parameter values of each emotion and each desire (instinct) are regulated to vary in the range from 0 to 100, and the values of the coefficients ke and ki are set individually for each emotion and each desire.
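The periodic update of an emotion or desire parameter can be sketched as follows. Formulas (1) and (2) are not reproduced in this text, so the additive form used here, new value = current value + sensitivity × amount of change, is an assumption that is merely consistent with the three quantities named above; the clamp to the range 0 to 100 follows [0100]. The function names are hypothetical.

```python
def update_parameter(current, delta, sensitivity, low=0.0, high=100.0):
    """Assumed update: current + sensitivity * delta, limited to [low, high]."""
    return max(low, min(high, current + sensitivity * delta))


def update_all(values, deltas, sensitivities):
    """Update every parameter; names missing from deltas are left unchanged."""
    return {
        name: update_parameter(value, deltas.get(name, 0.0), sensitivities[name])
        for name, value in values.items()
    }
```

With per-parameter sensitivities, the same stimulus can move one emotion strongly and another only slightly, which is how the individually set coefficients ke and ki would shape the character of a particular robot under this assumed form.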
[0101] Meanwhile, as shown in drawing 9, the output semantics converter module 68 of the middleware layer 40 gives abstract action commands such as "advance", "be glad", "cry", or "tracking (pursue a ball)", received from the action change module 71 of the application layer 41 as described above, to the corresponding signal processing modules 61-67 of the output system 69.

[0102] When an action command is given, these signal processing modules 61-67 generate, based on that action command, the servo command values to be given to the corresponding actuators 25_1 to 25_n (drawing 7) to perform the action, the voice data of the sound to be output from the loudspeaker 24 (drawing 7), and/or the drive data to be given to the LEDs of the "eyes", and send these data in sequence to the corresponding actuators 25_1 to 25_n, the loudspeaker 24, or the LEDs through the virtual robot 33 of the robotic server object 32 and the signal processing circuit 14 (drawing 7).

[0103] In this way, based on the control program, robot equipment 1 can act autonomously according to the situation of itself (internal) and its surroundings (external) and to instructions and actions from the user.
[0104] (3-3) Implementation of the utterance algorithm on the robot equipment. The robot equipment can be configured as described above. The utterance algorithm described earlier is implemented as the sound playback module 67 in drawing 9 of this robot equipment 1.



[0105] The sound playback module 67 receives a sound output command (for example, "speak with joy") determined in a higher-level part (for example, a behavioral model), generates the actual voice time-series data, and transmits the data in order to the loudspeaker device of the virtual robot 33. As a result, an utterance sentence that consists of meaningless words and carries a feeling expression is emitted from the loudspeaker 24 of the robot equipment shown in drawing 7.

[0106] The behavioral model that generates utterance commands matched to feeling (hereafter, the utterance behavioral model) is now described. The utterance behavioral model is provided as one of the behavioral models in the behavioral model library 70 shown in drawing 10.

[0107] The utterance behavioral model constantly refers to the latest parameter values from the feeling model 73 or the instinct model 74 and, based on those values, determines the content of the utterance using a state transition table 80 such as that shown in drawing 10. That is, the value of a feeling is used as the condition for a transition from a given state, and utterance action matched to that feeling is performed at the time of the transition.

[0108] The state transition table used by the utterance behavioral model can be expressed as shown in drawing 14. Although its notation differs from that of the state transition table 80 shown in drawing 13, the state transition table used for the utterance behavioral model shown in drawing 14 is not substantially different. The state transition table shown in drawing 14 is described below.

[0109] In this example, joy (HAPPY), sadness (SAD), anger (ANGER), and time-out (TIMEOUT) are given as conditions for a transition from the node "node XXX" to other nodes. The concrete values of these transition conditions are given as HAPPY>70, SAD>70, ANGER>70, and TIMEOUT=timout.1, respectively. Here, timout.1 is a numeric value, for example a value indicating a predetermined time.
[0110] nodeYYY, nodeZZZ, nodeWWW, and nodeVVV are provided as the nodes to which a transition can be made from "node XXX", and the actions performed at these nodes are assigned as "Banzai (BANZAI)", "feeling down (OTIKOMU)", "trembling (BURUBURU)", and "yawn (AKUBI)", respectively.

[0111] Here, the expressive behavior "Banzai (BANZAI)" is defined as making an utterance that expresses "joy" (talk_happy), performing a banzai motion with the front legs (motion_banzai), and then wagging the tail (motion_swingtail). To make the utterance expressing "joy", the parameters for expressing joy prepared beforehand as described above are used. That is, joy is uttered based on the utterance algorithm described earlier.
[0112] The expressive behavior "feeling down (OTIKOMU)" is defined as making an utterance that expresses "sadness" (talk_sad) and performing a so-called timid, sulking motion (motion_ijiiji). To make the utterance expressing "sadness", the parameters for expressing sadness prepared beforehand as described above are used. That is, sadness is uttered based on the utterance algorithm described earlier.

[0113] The expressive behavior "trembling (BURUBURU)" is defined as making an utterance that expresses "anger" (talk_anger) and performing a motion of trembling with anger (motion_buruburu). To make the utterance expressing "anger", the parameters for expressing anger prepared beforehand as described above are used. That is, anger is uttered based on the utterance algorithm described earlier.
[0114] The expressive behavior "yawn (AKUBI)" is defined as a motion of yawning out of boredom when there is nothing to do (motion_akubi).
[0115] Each action performed at each destination node is defined in this way, and the transition to each node is determined by a probability table. That is, the transition to each node is determined by a probability table that describes the probability of each action when the transition condition is satisfied.
[0116] In the example shown in drawing 14, when the robot is happy (HAPPY),
i.e., when the value of HAPPY exceeds the predetermined threshold of 70, the
expressive behavior "Banzai! (BANZAI)" is chosen with a probability of 100%.
When it is sad (SAD), i.e., when the value of SAD exceeds the predetermined
threshold of 70, the expressive behavior "feeling down (OTIKOMU)" is chosen
with a probability of 100%. When it is angry (ANGER), i.e., when the value of
ANGER exceeds the predetermined threshold of 70, the expressive behavior
"trembling (BURUBURU)" is chosen with a probability of 100%. And in the case of
a time-out (TIMEOUT), i.e., when the value of TIMEOUT reaches the predetermined
threshold timout.1, the expressive behavior "yawn (AKUBI)" is chosen with a
probability of 100%. In this example every behavior is chosen with a
probability of 100%, that is, the behavior is always expressed, but the
definition is not limited to this. For example, when the robot is happy
(HAPPY), the behavior "Banzai! (BANZAI)" may instead be chosen with a
probability of 70%.
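
The selection rule of drawing 14 can be illustrated with a minimal Python
sketch. The thresholds, probabilities, and behavior names are taken from the
example above; the row layout, the function name select_behavior, the boolean
timed_out flag standing in for the TIMEOUT comparison, and the rule that the
first matching row wins are all assumptions made for illustration, not the
robot's actual implementation.

```python
import random

# Each row: (condition variable, threshold, behavior, probability in percent).
# Values follow the drawing 14 example; the row layout itself is an assumption.
TRANSITION_TABLE = [
    ("HAPPY", 70, "BANZAI", 100),
    ("SAD", 70, "OTIKOMU", 100),
    ("ANGER", 70, "BURUBURU", 100),
]


def select_behavior(state, timed_out=False, table=TRANSITION_TABLE, rng=random):
    """Return the expressive behavior chosen for the current state, or None."""
    for variable, threshold, behavior, percent in table:
        if state.get(variable, 0) > threshold:
            # Transition condition met: perform the behavior with the
            # probability listed in the table, otherwise do nothing.
            if rng.random() * 100 < percent:
                return behavior
            return None
    if timed_out:
        # Hypothetical stand-in for the TIMEOUT condition of the example.
        return "AKUBI"
    return None
```

Replacing the 100 in the HAPPY row with 70 gives the variant mentioned at the
end of the paragraph, in which "Banzai!" is expressed only about 70% of the
times the condition is met.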

[0117] By defining the state transition table of the utterance behavioral model
as described above, the robot apparatus can freely control utterances
corresponding to its emotion according to the input of other sensors or the
robot's condition.

[0118] In addition, the above embodiment was described taking the duration, the
pitch, and the sound volume as examples of parameters controlled by emotion.
However, the parameters are not limited to these; any component of the
utterance sentence that is influenced by emotion can also be used as a
parameter.
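
As a rough illustration of controlling duration, pitch, and volume by emotion,
the following Python sketch scales each phoneme by a per-emotion factor. The
Phoneme type, the function apply_emotion, and every numeric factor are
hypothetical and are not values from this document; simple multiplicative
scaling is a simplification used here in place of the utterance algorithm
described earlier.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Phoneme:
    symbol: str
    duration_ms: float
    pitch_hz: float
    volume: float  # relative loudness, 0.0 to 1.0


# Hypothetical (duration, pitch, volume) scale factors per emotion.
# These numbers are placeholders, not parameters from the document.
EMOTION_SCALES = {
    "HAPPY": (0.9, 1.2, 1.1),
    "SAD": (1.2, 0.8, 0.8),
    "ANGER": (0.8, 1.1, 1.3),
}


def apply_emotion(phonemes, emotion):
    """Return new phonemes with the three parameters scaled for the emotion."""
    d, p, v = EMOTION_SCALES.get(emotion, (1.0, 1.0, 1.0))
    return [
        replace(
            ph,
            duration_ms=ph.duration_ms * d,
            pitch_hz=ph.pitch_hz * p,
            volume=min(1.0, ph.volume * v),  # keep volume within range
        )
        for ph in phonemes
    ]
```

An emotion without an entry leaves the phonemes unchanged, so further
parameters or emotions could be added to the table without touching the
function.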

[0119] Moreover, the above embodiment was described for the case where the
emotion model of the robot apparatus is composed of emotions such as joy and
anger. However, the emotion model is not limited to being composed of the
emotions given as examples; it can also be composed of other factors that
affect emotion. In that case, the parameters constituting the utterance
sentence are controlled by those other factors.
[0120] 

[Effect of the Invention] The speech synthesis method according to the present
invention has an emotion discrimination step of discriminating the emotional
state of the emotion model of an utterance subject, an utterance sentence
output step of outputting an utterance sentence representing the content to be
uttered as speech, a parameter control step of controlling the parameters for
speech synthesis according to the emotional state discriminated in the emotion
discrimination step, and a speech synthesis step of inputting the utterance
sentence output in the utterance sentence output step into a speech synthesis
section and synthesizing speech based on the controlled parameters. The method
can thereby generate the utterance sentence of the utterance subject based on
speech synthesis parameters controlled according to the emotional state of the
utterance subject's emotion model.
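
The four steps can be sketched in Python as a single function. All names here
(discriminate_emotion, synthesize_with_emotion, and the arguments) are
hypothetical, and picking the emotion with the largest value is an assumption
made only to keep the sketch short; the document does not prescribe that rule
in this paragraph.

```python
def discriminate_emotion(emotion_model):
    """Emotion discrimination step: pick the strongest emotion (assumed rule)."""
    return max(emotion_model, key=emotion_model.get)


def synthesize_with_emotion(emotion_model, sentence_source, parameter_table, synthesizer):
    """Run the four steps in order and return the synthesizer's output."""
    emotion = discriminate_emotion(emotion_model)  # emotion discrimination
    sentence = sentence_source()                   # utterance sentence output
    params = parameter_table[emotion]              # parameter control
    return synthesizer(sentence, params)           # speech synthesis
```

Passing the sentence source and the synthesizer as callables keeps the sketch
independent of any particular speech synthesis engine.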

[0121] Moreover, the speech synthesizer according to the present invention has
an emotion discrimination means for discriminating the emotional state of the
emotion model of an utterance subject, an utterance sentence output means for
outputting an utterance sentence representing the content to be uttered as
speech, a parameter control means for controlling the parameters for speech
synthesis according to the emotional state discriminated by the emotion
discrimination means, and a speech synthesis means that is supplied with the
utterance sentence output by the utterance sentence output means and
synthesizes speech based on the controlled parameters. With this
configuration, the parameter control means controls the parameters for speech
synthesis according to the emotional state discriminated by the emotion
discrimination means, and the speech synthesis means, supplied with the
utterance sentence output by the utterance sentence output means, can
synthesize speech based on the controlled parameters. The speech synthesizer
can thereby generate the utterance sentence of the utterance subject based on
speech synthesis parameters controlled according to the emotional state of the
utterance subject's emotion model.

[0122] Moreover, the robot apparatus according to the present invention has an
emotion model that results from its actions, an emotion discrimination means
for discriminating the emotional state of the emotion model, an utterance
sentence output means for outputting an utterance sentence representing the
content to be uttered as speech, a parameter control means for controlling the
parameters for speech synthesis according to the emotional state discriminated
by the emotion discrimination means, and a speech synthesis means that is
supplied with the utterance sentence output by the utterance sentence output
means and synthesizes speech based on the controlled parameters. With this
configuration, the parameter control means controls the parameters for speech
synthesis according to the emotional state discriminated by the emotion
discrimination means, and the speech synthesis means, supplied with the
utterance sentence output by the utterance sentence output means, can
synthesize speech based on the controlled parameters. The robot apparatus can
thereby generate the utterance sentence of the utterance subject based on
speech synthesis parameters controlled according to the emotional state of the
utterance subject's emotion model.



DESCRIPTION OF DRAWINGS 
[Brief Description of the Drawings] 

[Drawing 1] A flow chart showing the basic configuration of an embodiment of
the speech synthesis method according to the present invention.
[Drawing 2] A drawing showing the relation between the duration of each
phoneme and the pitch.
[Drawing 3] A drawing showing the first half of a program for creating an
utterance sentence by speech synthesis.
[Drawing 4] A drawing showing the second half of the program for creating an
utterance sentence by speech synthesis.
[Drawing 5] A drawing showing the relation among the classes of emotion in a
feature space or action plane.
[Drawing 6] A perspective view showing the external configuration of a robot
apparatus that is an embodiment of the present invention.



[Drawing 7] A block diagram showing the circuit configuration of the robot
apparatus.
[Drawing 8] A block diagram showing the software configuration of the robot
apparatus.
[Drawing 9] A block diagram showing the configuration of the middleware layer
in the software configuration of the robot apparatus.
[Drawing 10] A block diagram showing the configuration of the application
layer in the software configuration of the robot apparatus.
[Drawing 11] A block diagram showing the configuration of the behavioral model
library of the application layer.
[Drawing 12] A drawing used to explain the finite probabilistic automaton that
serves as the information for deciding the actions of the robot apparatus.
[Drawing 13] A drawing showing the state transition table prepared for each
node of the finite probabilistic automaton.
[Drawing 14] A drawing showing the state transition table of the utterance
behavioral model.
[Description of Notations] 

1 Robot Apparatus, 10 CPU, 14 Signal Processing Circuit, 24 Loudspeaker, 70
Behavioral Model, 73 Emotion Model



